4/08/2014

Google Glass in Warehouse Automation

Imagine that you operate a huge warehouse where you store all the awesome goods you sell.
Well, one of our customers does.
For instance, if you run a supermarket, your process consists of at least these major stages:
  1. Online order management & checkout.
  2. Order packaging.
  3. Incoming goods processing and warehouse logistics.
  4. Order delivery.
Here's an oversimplified picture of your flow:
Modern Supply Chain
Source: http://www.igd.com/our-expertise/Supply-chain/In-store/3459/On-Shelf-Availability

If you relax for a minute, you can probably brainstorm several ways Google Glass could be applied in a warehouse.
But let's focus on one link in that picture: "Goods received and unloaded". If you break it down into pieces, it seems to be a rather simple process:
  1. A truck arrives at your warehouse.
  2. You unload the truck.
  3. You add all items into your Warehouse Management system.
  4. You place each item in its specific place inside the warehouse.
But it gets complicated when you need to enter specific data manually: expiry dates, amounts, etc. We thought about optimizing a person's performance during this phase. Wouldn't it be great if you could just:
  1. pick up a box with both hands,
  2. carry it to the shelf,
  3. go back to the truck & repeat?
On one hand, you have some spare seconds while you grab a box and carry it. On the other, Google Glass has a camera and voice recognition.

We combined the two. And that's what we created:


Pretty cool, yeah? Here you can see the automation of all the basic tasks we mentioned above. It is incredible how useful Google Glass can be for warehouse automation.

In the next section we're going to dive deep into technical details, so in case you're a software engineer you can find some interesting pieces of code below.

Working with Camera in Google Glass

We wanted to let a person grab a package and scan its barcode at the same time. First of all, we tried implementing barcode scanning routines for Glass. Luckily, Glass is actually an Android device.
So, as usual, add the permissions to AndroidManifest.xml, initialize your camera and have fun.
<uses-feature android:name="android.hardware.camera" />
<uses-feature android:name="android.hardware.camera.autofocus" />

<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
But wait, what the hell is going on? Why does the screen show something similar to this:


Glass is a beta product, so you need some hacks. When you implement the surfaceChanged(...) method, don't forget to add a parameters.setPreviewFpsRange(30000, 30000) call. Eventually, your surfaceChanged(...) should look like this:

public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
    ...
    Camera.Parameters parameters = mCamera.getParameters();
    Camera.Size size = getBestPreviewSize(width, height, parameters);
    parameters.setPreviewSize(size.width, size.height);
    // Glass-specific hack: pin the preview FPS range, otherwise the preview is garbled
    parameters.setPreviewFpsRange(30000, 30000);
    mCamera.setParameters(parameters);

    mCamera.startPreview();
    ...
}
That's the way you can make it work.
P.S. Unfortunately, Glass currently has just one focus mode -- "infinity". I hope things will get better in the future.
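For what it's worth, you can pin that mode explicitly. Here's a minimal sketch, assuming the same parameters and mCamera objects as in surfaceChanged(...) above:

// Glass only supports fixed-focus "infinity"; setting it explicitly
// avoids surprises if this code ever runs on another Android device.
parameters.setFocusMode(Camera.Parameters.FOCUS_MODE_INFINITY);
mCamera.setParameters(parameters);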

Working with Barcodes in Google Glass

Once you see a clear picture inside your prism, let's proceed with barcode scanning. There are several barcode scanning libraries out there: zxing, zbar, etc. We grabbed a copy of the zbar library and integrated it into our project.
  1. Download a copy of it.
  2. Copy the armeabi-v7a folder and the zbar.jar file into the libs folder of your project.
  3. Use it with the camera:
Initialize the JNI bridge for the zbar library:
static {
    System.loadLibrary("iconv");
}
Add to onCreate(...) of your activity:
setContentView(R.layout.activity_camera);
// ...
scanner = new ImageScanner();
scanner.setConfig(0, Config.X_DENSITY, 3);
scanner.setConfig(0, Config.Y_DENSITY, 3);
And create a Camera.PreviewCallback instance like this; it scans each preview frame, and you receive the scanning results in it.
Camera.PreviewCallback previewCallback = new Camera.PreviewCallback() {
    public void onPreviewFrame(byte[] data, Camera camera) {
        Camera.Size size = camera.getParameters().getPreviewSize();

        // preview frames arrive as NV21; zbar's ImageScanner expects Y800
        Image barcode = new Image(size.width, size.height, "NV21");
        barcode.setData(data);
        barcode = barcode.convert("Y800");
        int result = scanner.scanImage(barcode);

        if (result != 0) {
            SymbolSet syms = scanner.getResults();
            for (Symbol sym : syms) {
                doSmthWithScannedSymbol(sym);
            }
        }
    }
};
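Don't forget to attach the callback to the camera. A minimal sketch, assuming the mCamera and previewCallback fields from the snippets above:

// feed every preview frame into the scanner;
// use setOneShotPreviewCallback(previewCallback) if a single frame is enough
mCamera.setPreviewCallback(previewCallback);
mCamera.startPreview();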
You can actually skip the barcode.convert("Y800") call and the scanner will still work: the Android camera returns images in NV21 format by default, and the first plane of an NV21 frame is exactly the Y800 luminance data that zbar's ImageScanner expects. That's it. Now you can scan barcodes with your Glass :)

Handling Voice input in Google Glass

Apart from the camera, Glass has microphones, which let you control it via voice. Voice control looks natural here, although people around you may find it disturbing. Especially when it can't recognize "ok glass, google what does the fox say" five times in a row.
As you remember, we want to avoid manual input of specific data from packages. Some groceries have an expiry date. Let's implement recognition of the expiry date via voice. This way, a person takes a package with both hands, scans the barcode, says the expiry date while carrying the package, and goes back for the next one.
From a technical standpoint, we need to solve two problems:
  1. perform speech-to-text recognition
  2. perform date extraction via free-form text analysis
Task #1 can be solved via the Google Speech Recognition API in Android. In order to use it from Glass, you use the standard Android intents:
private static final int EXPIRY_DATE_REQUEST = 1; // any unique request code
// ...
Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Say expiry date:");
startActivityForResult(intent, EXPIRY_DATE_REQUEST);
And override onActivityResult(...) in your Activity, of course:
@Override
protected void onActivityResult(int requestCode, int resultCode, Intent data) {
    if (resultCode == Activity.RESULT_OK && requestCode == EXPIRY_DATE_REQUEST) {
        doSmthWithVoiceResult(data);
    } else {
        super.onActivityResult(requestCode, resultCode, data);
    }
}

public void doSmthWithVoiceResult(Intent intent) {
    List<String> results = intent.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
    Log.d(TAG, String.valueOf(results));
    if (results.size() > 0) {
        String spokenText = results.get(0);
        doSmthWithDateString(spokenText);
    }
}
Once we have free-form text from the Google Voice Recognition API, we need to solve task #2: extracting contextual information from it. We need to understand which date is hidden behind phrases like:
  • in 2 days
  • next Thursday
  • 25th of May
In order to do this, you can either write your own lexical parser or find a third-party library. Eventually, we found an awesome library called natty. It does exactly this: it is a natural language date parser written in Java. You can even try it online.
Here's how you can use it in your project. Add natty.jar to your project. If you use Maven, add it via:
<dependency> 
    <groupId>com.joestelmach</groupId>
    <artifactId>natty</artifactId>
    <version>0.8</version>
</dependency>
If you just copy jars to your libs folder, you'll need natty together with its dependencies. Download all of them:
  • stringtemplate-3.2.jar
  • antlr-2.7.7.jar
  • antlr-runtime-3.2.jar
  • natty-0.8.jar
And use the Parser class in your source:
// doSmthWithDateString("in 2 days"); 

public static Date doSmthWithDateString(String text) {
    Parser parser = new Parser();
    List<DateGroup> groups = parser.parse(text);
    for (DateGroup group : groups) {
        List<Date> dates = group.getDates();
        Log.d(TAG, "PARSED: " + dates);
        if (dates.size() > 0) {
            return dates.get(0);
        }
    }
    return null;
}
That's it. natty does a pretty good job of transforming your voice into Date instances.

Instead of Summary

Glass is an awesome device, but it has some issues for now. Even though it's still in beta, you'll get tons of joy while developing apps for it! So grab a device, download the SDK and have fun!
P.S. You can find full source code for this example here: https://github.com/eleks/glass-warehouse-automation.

3/06/2014

FakeItEasy: be careful when wrapping an existing object


Isolation frameworks are very popular today — they are an important part of your automated tests. They allow you to easily isolate dependencies you don't control, such as file systems or network connections, using fake but controllable objects generated on the fly via reflection. On my last project I used FakeItEasy — a nice framework with a clean API. I like it very much (RIP, Moq). This post was inspired by a real-life situation.

Fake for an already created object

Normally you create fakes for interfaces or abstract classes. But suppose you want to partially fake a concrete object that has virtual methods:

Dog dog = A.Fake<Dog>();

Where the class Dog is implemented like this:

public class Dog
{
    public virtual string Bark()
    {
        return "Bark!";
    }

    public virtual string BarkBark()
    {
        return Bark() + Bark();
    }
}

For that fake dog both methods will return an empty string. Here is a sample test to confirm this:

[Test]
public void Bark_DespiteExistingImplementation_ReturnsEmptyString()
{
    Dog dog = A.Fake<Dog>();

    string result = dog.Bark();

    Assert.AreEqual("", result);
}

Hmm... As you can see, the method Bark() already has logic that returns the string "Bark!", but it is thrown away for the fake object, as if the fake were based on a completely abstract IDog interface. The same behavior applies to the NSubstitute framework. Moq returns null instead of an empty string.
If you want the logic implemented in virtual methods to be preserved, you have to create a fake that wraps an existing object:

Dog dog = A.Fake<Dog>(x => x.Wrapping(realDog));

Let's write a test to confirm:

[Test]
public void Bark_ForWrappedObject_ReturnsImplementationFromThatObject()
{
    Dog realDog = new Dog();
    Dog dog = A.Fake<Dog>(x => x.Wrapping(realDog));

    string result = dog.Bark();

    Assert.AreEqual("Bark!", result);
}

Great. Now suppose you want to override the method Bark() on the faked object to return the string "Quack!". That's simple to do:

A.CallTo(() => dog.Bark()).Returns("Quack!");

And now the question: what do you expect the method BarkBark() to return? It calls the virtual method Bark() that you overrode with the A.CallTo() construct. But...

[Test]
public void BarkBark_ForOverriddenBark_UsesOverriddenImplementation()
{
    Dog realDog = new Dog();
    Dog dog = A.Fake<Dog>(x => x.Wrapping(realDog));
    A.CallTo(() => dog.Bark()).Returns("Quack!");

    string result = dog.BarkBark();

    // Not what you expected: the unconfigured BarkBark() call is forwarded
    // to the wrapped realDog, whose real Bark() runs internally,
    // so this assert fails with "Bark!Bark!"
    Assert.AreEqual("Quack!Quack!", result);
}

For those who know how virtual methods work, this looks very counter-intuitive. So be aware of this behavior!
But if you do need to override the method Bark() in the traditional "virtual manner", what should you do? Unfortunately, FakeItEasy cannot help you in this situation. You have to code the FakeDog class manually and override the method Bark():

public class FakeDog : Dog
{
    public override string Bark()
    {
        return "Quack!";
    }
}


[Test]
public void BarkBark_ForManualFake_UsesOverriddenImplementation()
{
    Dog dog = new FakeDog();

    string result = dog.BarkBark();

    Assert.AreEqual("Quack!Quack!", result); // SUCCESS!
}

By the way, with the NSubstitute framework you'll experience the very same issue. Surprisingly, Moq has a simple solution to this problem — just set the CallBase property to true.
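For completeness, here is a sketch of the Moq variant (the test itself is ours, not taken from the Moq docs): with CallBase set to true, members without an explicit setup fall back to the base implementation, while the proxy still overrides Bark(), so virtual dispatch behaves the way you'd expect:

[Test]
public void BarkBark_WithMoqCallBase_UsesOverriddenBark()
{
    var dogMock = new Mock<Dog> { CallBase = true };
    dogMock.Setup(d => d.Bark()).Returns("Quack!");

    string result = dogMock.Object.BarkBark();

    Assert.AreEqual("Quack!Quack!", result); // SUCCESS!
}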


Conclusion

Try to avoid situations where your tests use the same object both as the SUT (System Under Test) and as a fake. Like in our example: we tested the class Dog, which was our SUT, and we also faked a part of it (the method Bark()). As you can see, this can lead to surprising results when combined with some of the popular isolation frameworks.

2/21/2014

Why Google Glass will fail and why this won’t stop smartglasses’ success

Smartglasses are probably the most promising kind of wearable device currently on the market. And Google Glass is the most interesting device among them. Yes, Google Glass is sexy and cool. It has great design; it is small and futuristic. After all, it feels like a typical Google product - simple and great. Nevertheless, I think it won't be successful as a product and will eventually fail.
Disclaimer: the following is my own opinion and does not express the official position of ELEKS or its R&D team. In fact, some of the team members argue with me a lot, saying that Glass is the best device ever, that it will conquer the world, and so on. Of course, I'm exaggerating a bit, but it still looks very likely to me that Glass will eventually fail as a commercial product.
So, what's the problem with Google Glass? Well, there are two aspects to it:
1. The device itself. I'll dwell upon its drawbacks in the next section.
2. Its positioning and marketing. I'll show you some interesting historical analogies below and try to project them onto the future.

Five Reasons Why Google Glass Sucks


So, what's wrong with the device? When you watch the ads or listen to the Google employees who evangelize Glass, it looks so futuristic and cool that you start to believe the future is here and search for the Order button. But things change when you actually get it. We bought one for our R&D team a few months ago and... well, I wouldn't say it is a big disappointment, but the device is very far from being market-ready. Here is the list of five things that we found very annoying about Google Glass:

2/10/2014

Data Science for Targeted Advertising: How to display relevant ads by leveraging past user behavior.

The online advertising industry is bigger than you think


The past decade has seen huge growth in online advertising. It is an enormous industry, with brands expected to pour over 30 billion dollars into it in 2014. Online advertising gives companies instant feedback, and publishers get more knowledge about their users. Advertisers are very interested in precisely targeted ads: they want to spend the smallest amount of money and get the maximum increase in profit. This is what targeted advertising solves. The problem involves determining where, when and to whom to display a particular advertisement on the Internet. Advertising systems deliver ads based on demographic, contextual or behavioral attributes. One example is sponsored search. It is the most profitable business model on the web and accounts for a huge share of income for the top search engines Google, Yahoo and Bing, generating at least 25 billion dollars per year.

There are a couple of viable methods for targeted advertising:
  • Demographic Targeting – this approach defines the target audience by gender, age, income, location, etc. It is an old but efficient approach, because it is easy to project behavior onto product categories. Demographic targeting is popular since it's easy to understand and implement, and it gives the advertiser transparency and control over the audience selected for targeting.
  • Property Targeting – a simple and popular targeting mechanism. The advertiser specifies the set of pages where the ad should be shown. For example, a company that sells trucks could show its advertisements on websites about vehicles.
  • Behavioral Targeting – serves ads to users by leveraging their past behavior (searches, site visits, purchases). The most valuable resource for behavioral targeting is the network traffic of a particular user: the more such data you have, the better the targeting results you will achieve. Thus, even local ISPs can serve more accurate ads to a consumer than Google or Yahoo.

Real-time bidding exchanges – the de facto standard for targeted advertising


The online advertising industry has grown significantly during the past few years, driven by extensive use of real-time bidding (RTB) exchanges. These auction systems allow advertisers to bid in real time on the opportunity to place online display ads. Advertisers integrate with the exchange via an API and collect a variety of data to decide whether or not to bid, and at what price. This has created a simple and efficient way for companies to target advertisements to particular users. In industry parlance, showing a display ad to a consumer is called an "impression". An auction is triggered the instant a user navigates to a web page, and it runs while the page is still loading in the user's browser. During the auction, information about the location of the potential advertisement, along with user information, is passed to bidders in the form of a bid request. This data is often augmented with information previously collected by advertisers about the user. When an auction starts, each potential advertiser decides whether it wants to bid on this impression, at what price, and which advertisement to show in case it wins the auction. There are billions of such real-time transactions each day, and advertisers require large-scale solutions to handle these auctions within milliseconds.
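To make the flow more concrete, here is a deliberately simplified, hypothetical sketch of a bid request and a bidding decision. All field names and values are made up for illustration; real exchanges use much richer schemas (e.g. OpenRTB):

# Hypothetical, heavily simplified bid request; not a real exchange schema.
bid_request = {
    "auction_id": "a81f2c",                    # assigned by the exchange
    "page_url": "http://example.com/article",  # where the ad slot lives
    "slot": {"width": 300, "height": 250},
    "user": {
        "cookie_id": "u-42",
        "segments": ["auto", "sports"],        # data appended by the advertiser
    },
}

def decide_bid(request, max_cpm=1.50):
    """Toy bidding logic: bid only on users in a targeted segment."""
    if "auto" in request["user"]["segments"]:
        return {"price_cpm": max_cpm, "ad_id": "truck-campaign-7"}
    return None  # no bid, skip this impression

print(decide_bid(bid_request))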

Such a complicated ecosystem is a perfect opportunity for applying machine learning techniques, which play a key role in the ad bidding optimization process, increasing targeting accuracy and reaching the ultimate goal from the marketer's perspective: "Address the right browsers with the right message at the right moment, and preferably at the right price".

Improving ad relevance by applying Machine Learning techniques


The main task of the machine learning system is to identify prospective customers: online users who have a higher propensity to purchase a specific product in the near future after being shown the advertisement. The ultimate goal is to build a system that learns a predictive model for each ad campaign automatically. One challenge in building such systems is that different ad campaigns may have different performance measures; however, each of these criteria can be approximately represented as some ranking of potential purchasers in terms of purchase propensity. A primary source of input features for behavioral targeting is the user's browser history, recorded as the set of web pages visited in the past. The target labels can be individual for each campaign, based on actual purchases of the specific product. At a high level this looks like a straightforward predictive modeling problem. But on closer inspection it turns out to be impossible to obtain the necessary amount of training data directly. First, the probability of a purchase in the 7 days after seeing an ad is very low, ranging from 0.0000001 to 0.001 depending on the campaign. Second, the input feature vector includes more than a million features even in the simplest case (when the user browsing history is encoded as a set of hashed URLs). These properties make training difficult; however, there are efficient approaches designed to predict consumer purchase propensity in exactly these circumstances.

Site visits as better purchase predictors than click-through rate (CTR)


We know that a purchase after seeing an advertisement is a rare event. This means training models on a highly imbalanced class distribution (skewed classes). The simplest and most widely used remedy is to train on a proxy target. Currently the most common proxy is clicks on the advertisement: campaign efficiency is often evaluated by "click-through rate" (CTR), so campaigns are optimized towards increasing CTR. In this approach, clicks on the advertisement are treated as positive samples; the model is trained on clicks instead of conversions, but the test set is still labeled by conversions. In a recent study [1], this approach was tested on 10 different ad campaigns. The results imply that targeting based on clicks does not necessarily maximize conversions.

Figure 1. Improvement in prediction accuracy by using conversions for training instead of clicks. Testing is done using conversions in both cases.
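To make the proxy-training setup concrete, here is a minimal sketch of "train on clicks, evaluate on conversions". The data is entirely synthetic and the model generic; this is not the setup from [1]:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.random((1000, 20))              # stand-in for browsing-history features
clicks = rng.integers(0, 2, 1000)       # proxy labels: clicked the ad or not
conversions = rng.integers(0, 2, 1000)  # true labels: purchased or not

model = LogisticRegression().fit(X[:800], clicks[:800])  # train on the proxy
scores = model.predict_proba(X[800:])[:, 1]
print(roc_auc_score(conversions[800:], scores))          # evaluate on conversions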

Are there other good proxy candidates for evaluating and optimizing advertising campaigns? Recent research [2] answered this question. In contrast to clicks, site visits turned out to be generally good proxies for purchases. Specifically, site visits do remarkably well as the basis for building models that target browsers who will purchase after being shown the ad. In some cases, models trained on site visits even produce better results than ones trained on conversions.

Figure 2. AUC performance distribution with respect to purchase prediction for models trained on clicks, site visits and purchases respectively.

The results show that site visitors are more likely to become purchasers than ad clickers are.

Dimensionality reduction techniques improve model accuracy


As mentioned earlier, another difficulty in predicting purchases is the huge input feature space, which typically requires dimensionality reduction. In most cases an ad targeting system tracks over 100 million unique URLs, any of which could be used in a predictive model. It is very expensive to build and store such high-dimensional models. A number of dimensionality reduction techniques are available nowadays, but not all of them are well suited to the ad targeting problem.

The simplest method for reducing a massive binary feature space is feature hashing. It transforms a bag of words into a bag of hashed IDs: given a set of tokens and a hash function h(), we apply the hash function to each token, and the new feature space is simply the set of hashed values. In other words, the hash function generates a column index for a given token. The output range of the hash function should be big enough to avoid collisions even with a million unique tokens. In Python, the idea looks like this:

def hash_vectorizer(features, N):
    """Hash a list of string tokens into a fixed-length count vector."""
    x = [0] * N
    for f in features:
        h = hash(f)
        x[h % N] += 1
    return x
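Usage is straightforward; colliding tokens simply accumulate in the same bucket (the URLs and vector size below are made up for illustration):

x = hash_vectorizer(["intel.com", "nytimes.com", "nyu.edu"], 10)
# x is a length-10 count vector; if two URLs hash into the same bucket
# (mod 10), that bucket's count just becomes 2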

Dimensionality reduction results from hash collisions. For example, if a URL set contains {intel.com, nytimes.com, nyu.edu}, and we have h("intel.com") = 6, h("nyu.edu") = 6 and h("nytimes.com") = 8, then in the new space the hashed URL set has values only for features 6 and 8. Hash functions typically produce 32-bit or 64-bit values, so to project into an arbitrary k-dimensional feature space we compute h() mod k.

Another approach is Contextual Categories. The web has a number of sources, both proprietary and free, that categorize specific web pages by their content. These categories serve as content-based groupings that can be used to reduce the dimensionality of the data. With category data, the original feature space of URLs becomes a feature space of categories.

There are many other techniques for dimensionality reduction, including Singular Value Decomposition (SVD) and Principal Component Analysis (PCA), which have proved to be good alternatives for reducing the huge URL feature space.
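As a quick illustration of the SVD route, here is a generic sketch with random sparse data standing in for hashed URL features (not a production pipeline):

from scipy.sparse import random as sparse_random
from sklearn.decomposition import TruncatedSVD

# random sparse matrix standing in for 1,000 users x 100,000 hashed URLs
X = sparse_random(1000, 100000, density=1e-4, random_state=0)

svd = TruncatedSVD(n_components=100, random_state=0)
X_reduced = svd.fit_transform(X)  # dense 1,000 x 100 matrix
print(X_reduced.shape)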

Conclusion 


This article gave a brief overview of the targeted advertising business, a multibillion-dollar industry that is growing dramatically. Most of the big players in the online advertising market work with real-time bidding (RTB) systems, which connect advertisers and publishers. An RTB exchange acts as an online auction, allowing advertisers to bid in real time on the opportunity to show a display ad to a particular user. Right now the industry's key metric for measuring the success of an ad campaign is click-through rate (CTR), but recent studies have shown that site visits are a better conversion predictor than clicks. At first sight, machine learning for targeted advertising seems trivial. On closer inspection, one notices underlying difficulties, including rare conversions, lack of training data and a high-dimensional input feature space. However, a number of studies have identified efficient solutions to these difficulties, providing good models for predicting future conversion events.

References

  1. S. Pandey, M. Aly, A. Bagherjeiran, A. Hatch, P. Ciccolo, A. Ratnaparkhi, and M. Zinkevich. Learning to target: What works for behavioral targeting.
  2. B. Dalessandro, R. Hook, C. Perlich, F. Provost. Evaluating and Optimizing Online Advertising: Forget the click, but there are good proxies. 
  3. C. Perlich, B. Dalessandro, O. Stitelman, T. Raeder, F. Provost. Machine learning for targeted display advertising: Transfer learning in action. 


1/06/2014

The Secret Ingredient of Business Development

Over the past 20 years in the IT business, ELEKS has developed a “secret” ingredient that is deeply rooted in its values and has greatly helped both the company and its customers. So let me share this secret ingredient with you.

We have learned that it is simply not enough to set a goal and just go for it. In order to maximize success, we needed a vehicle that would be more reliable and independent of certain risks that exist in the software development industry. We have been applying it to everything ever since: the creation of software, workflow processes within the company, as well as relationships with our customers. This key ingredient is the ecological systemic approach.