11/13/2012

Tesla K20 benchmark results

Recently we've developed a set of synthetic tests to measure NVIDIA GPU performance. We ran it on several test environments:


  • GTX 580 (PCIe-2, OS Windows, physical box)
  • GTX 680 (PCIe-2, OS Windows, physical box)
  • GTX 680 (PCIe-3, OS Windows, physical box)
  • Tesla K20Xm (PCIe-3, ECC ON, OS Linux, NVIDIA test data center)
  • Tesla K20Xm (PCIe-3, ECC OFF, OS Linux, NVIDIA test data center)
  • Tesla M2050 (PCIe-2, ECC ON, OS Linux, Amazon EC2)


Please note that the next-generation Tesla K20 is also included in our results (many thanks to NVIDIA for their early access program).
You can find results at Google Docs. Benchmark sources are available at our GitHub account.
Stay tuned, we're going to make some updates on this.

UPD: Detailed results with charts and some conclusions: http://www.elekslabs.com/2012/11/nvidia-tesla-k20-benchmark-facts.html

11/12/2012

Windows Azure Backup Tools available on GitHub

Migrating applications to the cloud involves a big step in the way we deploy and maintain our software. While leveraging all the tasty features provided by cloud platforms, such as high availability and seamless scalability, a good IT professional also wants to make sure that the cloud version of the application is every bit as reliable and secure as the on-premises version.


From the operations team's point of view, there are numerous aspects of running a cloud application properly, typical of which are:
  • Application data must be regularly backed up, and the time it takes to restore it must be as short as possible. Quick restores mean less downtime, and less downtime means happier customers.
  • It is preferable for the cloud application to be portable, which means it can be moved back and forth between a cloud-hosted datacenter and your on-premises environment without any modifications to the source code.
  • Maintenance tasks should be automated and include as few steps as possible to reduce the probability of human error.
Nowadays, public cloud vendors offer quite different functionality when it comes to application maintenance. While some of them concentrate on a rich web-based management UI, others invest their efforts in building a powerful API to automate these tasks. The more experienced and mature vendors do both. With this in mind, you have to weigh your typical operational tasks against the management features provided by a specific cloud vendor.

Having had some experience with migrating on-premises applications to Windows Azure, we must admit that while the new Metro-style management portal is quite pleasant and easy to use, it does not yet provide some features commonly required by our IT pros. For example, automatically backing up Windows Azure SQL Databases and restoring them locally is possible, but involves quite a lot of manual steps. Things become a little more difficult when you encounter such tasks as backing up data from on-premises applications to cloud storage as well as restoring such backups later: if you use private blob containers, managing such blobs is quite tedious because of the lack of UI tools.

In order to help the operations staff with common tasks, we have developed a few automated command-line tools that utilize various Windows Azure APIs behind the scenes. The source code is released under MIT License and is available on GitHub.

1. Backup Windows Azure SQL Database to Blob Storage.

This tool allows you to perform an automated backup of your SQL Database and store the backup to your Windows Azure Storage account as a blob (in BACPAC format). Later, this backup can be used to restore the database on another cloud SQL Database server as well as an on-premises Microsoft SQL Server instance. Internally, this tool utilizes DAC web service endpoints hosted on Windows Azure datacenters. Note that for every location the URL of the web service is different.

Usage example:
AzureSqlDbBlobBackup.exe --dac-service-url https://by1prod-dacsvc.azure.com/DACWebService.svc --db-server-name abcdef1234.database.windows.net --db-name Northwind --db-username db-admin --db-password db-admin-secret --storage-account northwindstorage --storage-account-key s0e1c2r3e4t== --blob-container backups --blob-name NorthwindBackup.bacpac --append-timestamp

2. Archive local files or folders to Blob Storage.

This tool allows you to upload zip-compressed copies of your local data to Windows Azure Storage, which can be helpful if you frequently use the cloud as reliable off-site storage for your digital assets.

Usage example:
ZipToAzureBlob.exe --source-path E:\MyData --compression-level=9 --storage-account northwindstorage --storage-account-key s0e1c2r3e4t== --blob-container backup --blob-name MyDataBackup.zip --append-timestamp

3. Download files from (private) Windows Azure Blobs.

The purpose of this tool is quite straightforward: it enables you to download a blob from Windows Azure Blob Storage to your local filesystem. It works especially well when the blobs are stored in a private container and are therefore not easily downloadable from the management portal. This tool, combined with the zip-archiving tool above, provides a quick and easy way to automate data backup and restore on top of reliable cloud storage.

Usage example:
DownloadAzureBlob.exe --storage-account northwindstorage --storage-account-key s0e1c2r3e4t== --blob-container backup --blob-name MyDataBackup.zip


Storage Emulator notice

Since these tools are primarily intended for use in a production environment, we have not yet added support for the Windows Azure Storage emulator (UseDevelopmentStorage=true), but stay tuned for upcoming updates to our GitHub repository.

11/09/2012

HTML5 Canvas: performance and optimization. Part 2: going deeper

Last time, when we talked about JavaScript optimization, we used some basic techniques to improve the performance of the Flood Fill algorithm on HTML5 Canvas. Today we're going to go deeper.

We discussed the results internally and came up with several low-level fixes. Here they are:

Minimize created objects count

First of all, we thought about the enormous number of objects we created during the execution of our algorithm. As you probably remember, we had applied a fix named 'Temp object creation' when we tried to minimize the number of arithmetic operations. It had a negative effect on performance because of increased memory allocation and garbage collector overhead. So, the fewer objects you create, the better performance you get. It is not hard to notice that most of the objects in our code are created here:
What if we don't create new objects here at all? Let's store individual coordinates on the stack instead of creating wrapper objects. Sure, it makes the code more complicated and harder to read, but performance is our main goal today. So, we came up with something like this:
The results follow:

Please note that we removed the two bad fixes from the previous article. We got nice results in all browsers, but in Safari the results were truly amazing: about a 45% performance boost. If we recall the bad "Temp object" fix from the previous article, Safari became dramatically slower than other browsers after that fix, so this result is a logical consequence of some issues Safari has with object allocation and/or garbage collection.

Inline functions

Let’s go deeper. Most modern compilers do function inlining automatically or provide you with ability to mark function with some kind of inline attribute (think C’s inline keyword or AggressiveInliningAttribute from .NET 4.5). JavaScript don’t allow you to do that. Although function inlining might have dramatic performance effect in case you call function very often. We call isSameColor about 2 million times and setPixelColor about 700K times. Let’s try to inline them manually:
Again, it makes our code less readable and understandable, but we really want better performance, so we don’t care about code readability here.

Isn’t it amazing? Absolutely incredible results: we’ve got 70% boost on Firefox, 56% on IE and 36% on Safari. And what about Chrome? Surprisingly it is 9% slower. It seems that Google already implemented automatic function inlining in V8 optimizer and our manual fix is worse than theirs. Another interesting thing: with that fix Firefox is almost two times faster than Chrome, previous leader.

Optimize CPU cache utilization

There is one thing we had never thought about before in the context of such a high-level language as JavaScript: CPU cache utilization. It is quite an interesting topic and deserves a separate article; for now, you can read more about it here, for example.
The ImageData array has two dimensions that are packed into a one-dimensional array line by line. So, a 3x3 pixel matrix with coordinates (x;y) is basically stored in memory like this:
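In other words, the pixels are laid out row by row, four bytes per pixel:

    // Row-major layout of a 3x3 ImageData array:
    // (0;0) (1;0) (2;0) | (0;1) (1;1) (2;1) | (0;2) (1;2) (2;2)
    // Each pixel occupies 4 consecutive bytes (R, G, B, A), so pixel (x;y)
    // starts at byte offset:
    var offset = (y * (img.width * 4)) + (x * 4);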

Let’s look at the following lines in our code:
Let’s think about the order we access neighbor pixels if we are in (1;1):

So, we’re going left, then once again left, then right, then again right. According to best practices of cache utilization optimization it is better to access memory sequentially, because it minimizes a chance to have cache miss. So, what we need is something like that:

Let’s rewrite our dx,dy arrays:

Here is what we’ve got:

Well, the only browser that reacted significantly was Chrome: we got about 10% better performance with it. Other browsers were up to 3% faster, which is within the margin of statistical error. Anyway, this is an interesting result: it means we should pay attention to such low-level optimizations even when writing JavaScript – they still matter.

Fixing our own bug – reordering if statements

Those of you who read the inlined code carefully might have already noticed that we actually made a mistake there:

We check the pixel color and only then make sure that we don't go outside the array bounds. In a statically typed language we would get some kind of IndexOutOfBoundsException or Access Violation error here. But in JavaScript, arrays are basically hash tables, so nothing prevents you from accessing a negative index. Still, because of the cost of the array element access operation, it makes sense to check the array bounds before checking colors:
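The reordered condition could look like this (a sketch in the same assumed loop context as the earlier snippets, with data = img.data):

    // The cheap bounds checks come first; thanks to && short-circuiting,
    // the comparatively expensive pixel reads are skipped entirely for
    // coordinates outside the image.
    var offset = ((y + dy[i]) * (img.width * 4)) + ((x + dx[i]) * 4);
    if (x + dx[i] >= 0 && x + dx[i] < img.width &&
        y + dy[i] >= 0 && y + dy[i] < img.height &&
        (data[offset] !== hitR || data[offset + 1] !== hitG || data[offset + 2] !== hitB)) {
        data[offset]     = hitR;
        data[offset + 1] = hitG;
        data[offset + 2] = hitB;
        data[offset + 3] = 255;
        stack.push(x + dx[i]);
        stack.push(y + dy[i]);
    }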

The results are surprising:

Most of the browsers' results were within the margin of statistical error, but Chrome was more than two times faster and got its crown of fastest browser in our benchmark back! It is hard to tell what the reason for such a dramatic difference is. Maybe other browsers use instruction reordering and had already applied that optimization by themselves, though it would be strange if Chrome didn't do it too. There may also be some hidden optimization heuristics in Chrome that help it understand that the stack is used as a simple array, not a hash table, and it makes some significant optimization based on that fact.

Conclusions

  • Low-level optimizations matter even if you write your code in a high-level language such as JavaScript. The code is still executed on the same hardware as if you had written it in C++.
  • Each JavaScript engine is different. You can get a significant performance boost in one browser while the same fix makes your code slower in another. Test your code in all browsers your application must work with.
  • Keep the balance between code readability and performance considerations. Sometimes it makes sense to inline a function even if it makes your code less readable. But always make sure that it brings you the desired value: it doesn't make sense to sacrifice code readability for a 2 ms performance boost in code that is already fast enough.
  • Think about object allocation, memory usage and cache utilization, especially if you work with memory-intensive algorithms such as Flood Fill.

You can find all the results with exact numbers at Google Spreadsheets: https://docs.google.com/open?id=0B1Umejl6sE1raW9iRkpDSXNyckU
You can check our demo code at GitHub: https://github.com/eleks/canvasPaint
You can play with the app deployed on S3: https://s3.amazonaws.com/rnd-demo/canvasPaint/index.html
Thanks to Yuriy Guts for proposed low-level fixes.
Stay tuned!



11/08/2012

Cloud Solution for Global Team Engagement at Your Fingertips

Effective management of global teams is one of the most common challenges for localization service providers. Many LSPs are looking for ways to minimize their costs and time through automation. Generally, LSPs deliver localization services by engaging both their in-house staff and subcontractors, who are usually located all over the world.

To address this challenge, ELEKS has developed a cloud-based system that can be used by both subcontractors and in-house teams. The system fully automates the installation of the products needed by the teams, and this automation is synchronized with the product localization process.

The cloud solution has already shown its numerous benefits:
  • measurability by projects, resources, hours
  • increased efficiency
  • decreased deployment and support (in man-hours)
  • centralized storage and management system
  • easier support and daily backups
  • server uptime – 99.8%

The solution was showcased at the Localization World 2012 conference in Seattle by Taras Tovstyak. The presentation included a case study and the most interesting features of the cloud system. It also offered a glimpse into the future of localization – a video of Dynamic Localization in action.



11/07/2012

HTML5 Canvas: performance and optimization

There's no doubt that HTML5 is going to be the next big platform for software development. Some people say it could even kill traditional operating systems, and that all applications in the future will be written in HTML5 and JavaScript. Others say HTML5 apps will have their market share but will never replace native applications completely. One of the main reasons, they say, is poor JavaScript performance. But wait, browser vendors say they have done lots of optimizations and JavaScript is faster than ever before! Isn't that true?
Well, the simple answer is yes... and no. Modern JavaScript engines such as Google's V8 have impressive performance if you compare them with their predecessors from five to ten years ago. However, their results are not so impressive if you compare them with statically typed languages such as Java or C#. And of course, it would be an absolutely unfair competition if we compared JavaScript with native code written in C++.
But how can one determine whether their application can be written in JavaScript, or whether they should choose native tools?
Recently we had a chance to make that kind of decision. We were working on a proposal for a tablet application that should include a Paint-like control where the user can draw images using standard drawing tools like Pencil and Fill. The target platforms were Android, Windows 8 and iOS, so cross-platform development tools had to be taken into consideration. From the very beginning there was a concern that HTML5 canvas could be too slow for such a task. We implemented a simple demo application to test canvas performance and determine whether it is applicable in this case. Leaping ahead, let us point out that we have mixed feelings about the gathered results. On the one hand, canvas was fast enough for simple functions like pencil drawing due to the native implementation of basic drawing methods. On the other hand, when we implemented the classic Flood Fill algorithm using the pixel manipulation API, we found that it is too slow for that class of algorithms. During that research we applied a set of performance optimizations to our Flood Fill implementation. We measured their effect in several browsers and want to share them with you.

Initial implementation 

Our very first Flood Fill implementation was very simple:
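For reference, a simplified sketch of such an implementation is shown below (the helper names and exact structure are illustrative; the real code lives in the GitHub repository linked at the end of the post). Here hitColor is the color we are filling with, so a pixel is processed only if it doesn't have that color yet.

    // A straightforward stack-based flood fill over the canvas ImageData array.
    function floodFill(ctx, startX, startY, hitColor) {
        var img = ctx.getImageData(0, 0, ctx.canvas.width, ctx.canvas.height);
        var dx = [-1,  0, 1, 0];
        var dy = [ 0, -1, 0, 1];
        var stack = [{ x: startX, y: startY }];
        setPixelColor(img, startX, startY, hitColor);

        while (stack.length > 0) {
            var cur = stack.pop();
            for (var i = 0; i < 4; i++) {
                if (getPixelColor(img, cur.x + dx[i], cur.y + dy[i]) != hitColor &&
                    cur.x + dx[i] >= 0 && cur.x + dx[i] < img.width &&
                    cur.y + dy[i] >= 0 && cur.y + dy[i] < img.height) {
                    setPixelColor(img, cur.x + dx[i], cur.y + dy[i], hitColor);
                    stack.push({ x: cur.x + dx[i], y: cur.y + dy[i] });
                }
            }
        }
        ctx.putImageData(img, 0, 0);
    }

    // Naive helpers: the pixel offset arithmetic is repeated for every
    // color component that is read or written.
    function getPixelColor(img, x, y) {
        return (img.data[((y * (img.width * 4)) + (x * 4))] << 16) |
               (img.data[((y * (img.width * 4)) + (x * 4)) + 1] << 8) |
                img.data[((y * (img.width * 4)) + (x * 4)) + 2];
    }

    function setPixelColor(img, x, y, color) {
        img.data[((y * (img.width * 4)) + (x * 4))]     = (color >> 16) & 0xFF;
        img.data[((y * (img.width * 4)) + (x * 4)) + 1] = (color >> 8) & 0xFF;
        img.data[((y * (img.width * 4)) + (x * 4)) + 2] = color & 0xFF;
        img.data[((y * (img.width * 4)) + (x * 4)) + 3] = 255;
    }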

We tested it in 3 desktop browsers running on a Core i5 (3.2 GHz) and on a 3rd-generation iPad with iOS 6. We got the following results with that implementation:

Surprisingly, IE 10 is even slower than Safari on the iPad. Chrome proved that it is still the fastest browser in the world.

Optimize pixel manipulation 

Let's take a look at the getPixelColor function:
The code looks a little bit ugly, so let's cache the result of the ((y * (img.width * 4)) + (x * 4)) expression (the pixel offset) in a variable. It also makes sense to cache the img.data reference in another variable. We applied similar optimizations to the setPixelColor function as well:
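The optimized helpers could look roughly like this (again a sketch rather than the exact repository code):

    // The img.data reference and the pixel offset are computed once per call
    // and reused for every color component.
    function getPixelColor(img, x, y) {
        var data = img.data;
        var offset = (y * (img.width * 4)) + (x * 4);
        return (data[offset] << 16) | (data[offset + 1] << 8) | data[offset + 2];
    }

    function setPixelColor(img, x, y, color) {
        var data = img.data;
        var offset = (y * (img.width * 4)) + (x * 4);
        data[offset]     = (color >> 16) & 0xFF;
        data[offset + 1] = (color >> 8) & 0xFF;
        data[offset + 2] = color & 0xFF;
        data[offset + 3] = 255; // keep the pixel fully opaque
    }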

At least the code looks more readable now. And what about performance?


Impressive: we got a 40-50% performance gain in desktop browsers and about 30% in Safari on iOS. IE 10 now has performance comparable to mobile Safari. It seems that Safari's JavaScript compiler had already applied some of the optimizations we made, so the effect was less dramatic for it.

Optimize color comparison 

Let's take a look at the getPixelColor function again. We mostly use it in an if statement to determine whether a pixel has already been filled with the new color: getPixelColor(img, cur.x + dx[i], cur.y + dy[i]) != hitColor. As you probably know, the HTML5 canvas API provides access to the individual color components of each pixel. We use these components to build the whole color in RGB format, but here we actually don't need to do that. Let's implement a special function to compare a pixel's color with a given color:
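A sketch of such a function (here the target color is assumed to be split into its r, g and b components by the caller; the real code may differ):

    // Compare the pixel with the given color one component at a time.
    function isSameColor(img, x, y, r, g, b) {
        var data = img.data;
        var offset = (y * (img.width * 4)) + (x * 4);
        if (data[offset] !== r || data[offset + 1] !== g || data[offset + 2] !== b) {
            return false; // some component differs
        }
        return true;
    }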
Here we use the standard behavior of the || operator: it doesn't evaluate the right part of the expression if the left part returns true. This optimization allows us to minimize the number of array reads and arithmetic operations. Let's take a look at its effect:

Almost no effect: 5-6% faster in Chrome and IE and 2-3% slower in Firefox and Safari. So, the problem must be somewhere else. We left this fix in our code because, on average, the code is a little bit faster with it than without it.

Temp object for inner loop

As you probably noticed, the code in the main flood fill loop looks a little bit ugly because of the duplicated arithmetic operations:

Let's rewrite it using a temp object for the new point we work with:
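Sketched out, the rewrite looks something like this (same assumed context as the earlier snippets; hitR, hitG and hitB stand for the components of hitColor):

    // The neighbor point is computed once into a small temp object instead
    // of repeating the cur.x + dx[i] / cur.y + dy[i] arithmetic everywhere.
    for (var i = 0; i < 4; i++) {
        var next = { x: cur.x + dx[i], y: cur.y + dy[i] };
        if (!isSameColor(img, next.x, next.y, hitR, hitG, hitB) &&
            next.x >= 0 && next.x < img.width &&
            next.y >= 0 && next.y < img.height) {
            setPixelColor(img, next.x, next.y, hitColor);
            stack.push(next);
        }
    }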
And let's test the effect:
The results are discouraging. It seems that the side effect of this fix is a higher garbage collector load and, as a result, overall slowness of the application. We tried replacing the object with two coordinate variables defined in the outer scope, but it didn't help at all. The logical decision was to revert that code, which is exactly what we did.

Visited pixels cache

Let's think again about how pixels are visited in the Flood Fill algorithm. It is obvious that we should visit each pixel only once. We guarantee such behavior by comparing the colors of neighbor pixels with the hit pixel color, which must be a slow operation. Instead, we can mark pixels as visited and compare colors only if a pixel has not been visited yet. Let's do it:
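A sketch of the idea (illustrative; the flag array and the loop context are assumptions based on the earlier snippets):

    // One "visited" flag per pixel, checked before the more expensive
    // color comparison.
    var visited = new Array(img.width * img.height);

    // ...and inside the neighbor loop:
    for (var i = 0; i < 4; i++) {
        var idx = (cur.y + dy[i]) * img.width + (cur.x + dx[i]);
        if (visited[idx]) {
            continue; // skip the color comparison entirely for known pixels
        }
        visited[idx] = true;
        if (!isSameColor(img, cur.x + dx[i], cur.y + dy[i], hitR, hitG, hitB) &&
            cur.x + dx[i] >= 0 && cur.x + dx[i] < img.width &&
            cur.y + dy[i] >= 0 && cur.y + dy[i] < img.height) {
            setPixelColor(img, cur.x + dx[i], cur.y + dy[i], hitColor);
            stack.push({ x: cur.x + dx[i], y: cur.y + dy[i] });
        }
    }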
So, what are the results? Well, here they are:

Again, absolutely unexpected results: IE 10 is 10% faster with this fix, but the other browsers are dramatically slower! Safari is even slower than with the initial implementation. It is hard to tell what the main reason for this behavior is, but we can suppose it is the garbage collector. It may still make sense to apply this fix if you don't target mobile Safari and want maximum performance in the worst case (sorry, IE, that's you. As usual).

Conclusions

We tried to make some more optimizations, but they didn't help. The worst thing about JavaScript optimizations is that it is hard to predict their effect, mainly because of implementation differences between engines. Remember, there are two basic rules when you optimize JavaScript code:

  • benchmark results after each optimization step 
  • test in each browser you want your application to work with

HTML5 is cool, but still much slower than native platforms. You should think twice before choosing it as the platform for a compute-intensive application. In other words, there will be no pure HTML5 Photoshop for a long time. You can probably move some calculations to the server side, but sometimes that is not an option.
You can check our demo code at GitHub: https://github.com/eleks/canvasPaint 
You can play with the app deployed on S3: https://s3.amazonaws.com/rnd-demo/canvasPaint/index.html
Stay tuned!

UPD: Part 2: going deeper!

11/05/2012

What does a good web application look like? Part 2: the dev/ops point of view.

Last time we wrote about web applications, our main focus was on the user perspective. It is time to discuss the other dimensions. So, what does a good web application look like for developers and the operations team?

We have prepared a short list of the most important properties for those folks:

  1. Availability - the ability to operate within a declared proportion of time. It is usually defined in a Service Level Agreement (SLA) as a specific number of "nines" (e.g. four nines = 0.9999, so the system can be unavailable for at most 0.0001 of the time, roughly one hour per year). The availability of your application is not only a matter of well-written code; it also depends on hardware, network configuration, deployment strategy, operations team proficiency and many other things.
  2. Scalability - the ability to serve an increasing number of requests without a need for architectural changes. If you have a scalable application, you can simply add more hardware to your cluster and serve more and more clients. If you host your application in a cloud, you can even scale it up and down dynamically, making your application incredibly cost-efficient.
  3. Fault tolerance - the ability to operate in case of an unpreventable failure (usually hardware). Usually it means that the system can lose some part of its functionality in case of failure, but the other parts keep working. Fault tolerance is related to availability, and some people consider it one of the properties of highly available applications.
  4. Extensibility - system functionality can be extended without a need for core and/or architectural changes. Usually in this case the system is extended by adding plug-ins and extension modules. Sometimes it could be quite tricky to implement extensibility, especially for SaaS. 
  5. Multitenancy - ability to isolate logical user spaces (e.g. individual or organization) so that the tenants feel like they are the only user/organization in the system. It sounds easy, but could be a challenge on a large scale.
  6. Interoperability - the ability to integrate with other systems, usually by providing or consuming some kind of API. With a comprehensive API, your service can leverage the full power of the developer community. By consuming other services' APIs, you can extend your application's functionality in the easiest way.
  7. Flexibility - an architectural property that describes the ability of the system to evolve and change without a need for significant changes in its architecture. The Holy Grail of software architecture: it is almost impossible to achieve for fast-growing web applications, but you should always do your best.
  8. Security - the ability to protect information from unauthorized access, use, disclosure, disruption, modification, perusal, inspection, recording or destruction. Another critical property for both enterprise and consumer markets. Nobody wants their data to be exposed to unauthorized access.
  9. Maintainability - ease of maintenance, error isolation and correction, upgrading and dealing with a changing environment. Of course, it is better to have a system that doesn't require maintenance at all, but in the real world even the best systems do require it. You have to provide a comprehensive maintenance toolset to your operations team in order to keep your system up and running most of the time.
  10. Measurability - ability to track and record system usage. Usually it is required for analytic purposes and in pay-per-use scenarios. Even if you don't have pay-per-use scenarios it is always better to understand your hardware utilization rate in order to optimize costs.
  11. Configurability - the ability to change the system's look and behavior without changing anything in its code. While critical for web products installed on-premises, it is also important for the software-as-a-service model.
  12. Disaster Recovery - ability to recover after significant failures (usually hardware). This usually includes a disaster recovery plan that lists possible failure scenarios and steps the operations team should perform to recover the system from failure.
  13. Cost Efficiency/Predictability - the ability to operate efficiently in relation to the cost of operation. Closely related to measurability, this property concentrates on the financial effectiveness of a web application.
You have to account for lots of things when developing a web application. We hope this list is helpful for you. Stay tuned!

11/02/2012

New ELEKS web-site

Several facts about our new web-site:

Making of:

10/08/2012

DevTalks #4 presentations

Materials from our internal DevTalks event (October 4, 2012).
1. Tiny Google projects (by Ostap Andrusiv)


2. Amazon Web Services crash course: exploring capabilities of the Cloud (by Yuriy Guts)


9/21/2012

What does a good web application look like? Part 1: the user point of view.


People often say that software is "good" when they are satisfied with it. But it is tricky to describe the objective criteria for "good" software. Moreover, it may mean absolutely different things to different people. Recently we asked ourselves how we see the ideal web application. What characteristics should it have? What is important and what isn't?
We prepared a short list of characteristics that we think should be present in almost any web application that wants to be called "good". We divided this list into three parts. Today we would like to share the first part - how the application looks from the user's point of view.

So, what is important for users?

  1. Usability. By usability people usually mean ease of use and learnability. If your enterprise system has poor usability, you have to spend lots of time and money training your users, and even that may not help. In the consumer market, poor usability usually means you lose lots of users simply because they don't understand how to use your system.
  2. Globalization. The ability to adapt to the language and cultural specifics of various target markets. You don't have to globalize your application if you target a local market. But if you have big plans for total world domination, you have to take care of this.
  3. Multiplatform Optimization. The ability to operate efficiently on various platforms and browsers (including mobile). Ten years ago it was pretty easy to make your application available to almost any user, simply because almost all of them were using Internet Explorer on desktop PCs running Windows. Nowadays this is no longer true - you have to deal with various browsers on various operating systems running on various hardware. And don't forget about native mobile clients for your service!
  4. Customizability. The ability to customize the system's UI and behavior according to particular user needs. A somewhat controversial characteristic for a web application: some people say you shouldn't let users customize your application, because users don't actually know how it should look and behave. But sometimes it is important to provide this sort of freedom to them.
  5. Responsiveness. The ability to respond to user interaction within a given time period. Nobody likes an application that freezes after some action and doesn't show any progress to the user. The user should be able to interact with the application even if it is performing some complex, long-running task at the moment. Sometimes it makes sense to completely change the way your application interacts with the user in order to make it more responsive.
  6. Bandwidth Awareness. Efficient usage of the network channel between the system and the user. It is critical for big content-delivery services like YouTube: for users who have a fast connection it should show video in HD resolution, but for users connected through slow and expensive 3G it should consider showing video that consumes less traffic.
  7. Search Engine Friendliness. Ease of indexing by search engines and conformity with SEO best practices. If you want your service to be indexed by search engines, you have to think about it from the very beginning. Correct usage of things like human-readable URLs, semantic markup and permalinks can be the differentiator that puts your service in first place on the Google search results page.
  8. Social Media Integration. Integration with social networks: sharing, friends' recommendations, etc. Social networks have changed the Web dramatically over the last couple of years. If you target the consumer market, you should integrate with at least the biggest social networks.
  9. Native web stack. The ability to run in a browser without the need to install any additional plug-ins. Users don't want to install any additional software on their devices to run your application. Sometimes they even can't, because of platform limitations (think iOS, for instance). So, you have to rely only on the native web stack. Thankfully, HTML5 adoption is growing fast.
  10. Accessibility. Ease of use for users with disabilities or specific needs. You should think about accessibility if you want to make your service available to every single person. You must do it if you build, for instance, an e-government solution where accessibility is a strict requirement.

Next time we will tell you about the things application developers and the operations team care about. Stay tuned!

9/18/2012

Event Store was launched yesterday


We are proud to present Event Store - an awesome, rock-solid, super-fast persistence engine for event-sourced applications. It was launched yesterday in London by Greg Young and his team from ELEKS. You can watch a video of the presentation on the Skills Matter web-site.
Check out more details on Event Store web-site and GitHub.