Why should you "obfuscate" your code

Walter G.

Code obfuscation means transforming a working piece of code into a logically equivalent version that is difficult to read and decipher, usually in order to make an application (most often a web application) harder to analyze and attack.

The term itself is not limited to writing code. You can obfuscate a sentence, for example, by using obscure words and convoluted sentence structure. Some people might still be able to make out the meaning of the original message, but doing so takes more skill and effort.

The same is true for code. Some experts might be able to tell what an application does based on its overall design patterns alone, but it won't be easy.

Why do we obfuscate code?

There are two primary reasons why a programmer would take the extra step of distorting their codebase to make it hard to read.

The most common reason is improved security. While a programmer's job is to keep a clear separation between client-side code and server-side code, that doesn't always happen, and client-side code is typically the first place an attacker looks for a way into your website.

Increase security on the client

Typically that means JavaScript code. The main downside of a JavaScript-heavy application is that the code is essentially public to anyone willing to right-click and 'inspect' any part of the page.

Take the following snippet for example:

function loadFeed(){
    // loads article feed


    Service.GetFeed(token,...);
}

While this isn't critical code that requires a high level of security, it still gives away information through its context. We know, for example, that this is where the article feed comes from. We know that there is some kind of Service module in charge of making web requests. We also know that the token variable has some effect on the request.

Again, not the most critical code. And it's quite possibly one of the most common types of operations that you will see implemented in programming on most websites.

Real websites, though, are huge. Unlike the sample code often found in online tutorials, a real high-traffic website can contain hundreds to thousands of functions spread across multiple pages. So while one function probably won't give away much information, hundreds of functions can definitely help paint a clearer picture for any would-be intruder.

By obfuscating any client-facing code, you are adding one extra layer of defense. Unlike hashing a password or encrypting a piece of text, it offers no cryptographic guarantee; it simply raises the time and skill needed to understand your code.

Optimization

The second most common reason to obfuscate your codebase is to decrease the overall file size of the code. At least, most of the time; as discussed further down, heavier obfuscation can have the opposite effect. By replacing long and detailed function and variable names with one- or two-character identifiers, you can reduce the overall size considerably.

This is mainly relevant to client-side code, as server-side code is typically compiled and shorter function names bring no real improvement there. And while it is possible for compiled code to be decompiled, that is a topic for a future article and outside the scope of this one.

On the client though, a 100 kB file can often be trimmed down considerably, in some cases to around half of its original size. Take the same code snippet shown above, for example. A very quick obfuscation of that code could resemble the following:

Before:

function loadFeed(){
    // loads article feed


    Service.GetFeed(token,...);
}

After:

function a(){
    b.c(d,...);
}

Logically speaking, they are both identical functions calling on the same operations. The only difference is in the number of characters in each of the declarations.

Again, in one function the change might go unnoticed. But on a large codebase with dozens to hundreds of files, the site download speeds can definitely be improved.
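To put a rough number on that, the sketch below compares the byte size of a readable function with a hand-renamed equivalent. Both snippets are made up for illustration (the `fetchJson` helper and the parameter names are hypothetical and not part of the example above), and real savings will depend on the codebase.

```javascript
// Hypothetical readable source; fetchJson and its arguments are illustrative only.
const readable = `function loadArticleFeed(feedUrl, authToken) {
    // loads article feed
    return fetchJson(feedUrl, authToken);
}`;

// The same logic after renaming identifiers and dropping comments and whitespace by hand.
const renamed = "function a(b,c){return d(b,c)}";

// Byte length is what travels over the network (before any server-side compression).
function byteSize(source) {
    return Buffer.byteLength(source, "utf8");
}

const saved = byteSize(readable) - byteSize(renamed);
const percent = Math.round((saved / byteSize(readable)) * 100);
console.log(`${byteSize(readable)} bytes -> ${byteSize(renamed)} bytes (${percent}% smaller)`);
```

A tiny function like this exaggerates the ratio, since names and comments make up most of its text. Longer files with more logic than naming will usually shrink less.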

How to implement

The process of obfuscating your code will mainly depend on the web framework that you are using for your application. If you are running a Node application for example, then you have various packages that can handle this for you.

Before I go into any specific method though, remember that code obfuscation is a pre-production operation. That means you only perform the obfuscation once you have completed your development and are ready to ship to production. Presumably this also means that you already have some form of deployment process that runs after development.

You typically don't want to edit your obfuscated code by hand. Make changes in the readable source and generate the obfuscated version again.
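One way to picture that workflow is a small build step that reads the readable source and writes the transformed copy to a separate folder. This is only a sketch: `buildForProduction` and `placeholderTransform` are hypothetical names, the folder layout is an assumption, and the placeholder transform stands in for whichever minifier or obfuscator you end up using.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Reads a readable source file, applies a transform, and writes the result
// to a separate output directory. The original file is never modified.
function buildForProduction(sourceFile, outputDir, transform) {
    const source = fs.readFileSync(sourceFile, "utf8");
    fs.mkdirSync(outputDir, { recursive: true });
    const outputFile = path.join(outputDir, path.basename(sourceFile));
    fs.writeFileSync(outputFile, transform(source), "utf8");
    return outputFile;
}

// Placeholder transform (an assumption for this sketch): it only trims leading
// whitespace from each line. A real build would call a minifier or obfuscator here.
function placeholderTransform(source) {
    return source.split("\n").map((line) => line.trimStart()).join("\n");
}

// Demo in a temporary directory so nothing in a real project is touched.
const workDir = fs.mkdtempSync(path.join(os.tmpdir(), "build-demo-"));
const sourceFile = path.join(workDir, "feed.js");
fs.writeFileSync(sourceFile, "function loadFeed() {\n    return 1;\n}\n", "utf8");

const builtFile = buildForProduction(sourceFile, path.join(workDir, "dist"), placeholderTransform);
console.log("wrote", builtFile);
```

Keeping the output in its own directory means the obfuscated files can be deleted and regenerated at any time, while the readable source stays the only copy anyone edits.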

UglifyJS

UglifyJS is an all-in-one JavaScript parser, minifier, compressor and beautifier. It is also one of the most popular libraries of its kind, with over 14 million weekly downloads at the time of writing.

To install it globally through NPM, you can use the following command.

npm install uglify-js -g

And once installed, you can process your files with the following command.

uglifyjs [input files] [options]

You can read more about the possible options over at the official NPM page.

UglifyJS gives you a tremendous amount of control over how you choose to transform your codebase. You can run the process through the built-in CLI commands. Or you can dynamically convert your code through the exposed API.
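As a rough illustration of the API route, the sketch below wraps the library's `minify` function, which takes source text plus an options object and returns an object carrying either `code` or `error`. The `minifyWith` helper is a hypothetical name, and the library is passed in as an argument only so that the snippet does not fail where the package is not installed. Check the option names against the version you install.

```javascript
// Minifies a string of JavaScript with a UglifyJS-style library object.
// `uglify` is expected to be the module returned by require("uglify-js").
function minifyWith(uglify, source) {
    const result = uglify.minify(source, {
        // Also rename top-level functions and variables, not just local ones.
        mangle: { toplevel: true },
    });
    // minify() reports problems through result.error instead of throwing.
    if (result.error) {
        throw result.error;
    }
    return result.code;
}

// Usage, assuming `npm install uglify-js` has been run in the project:
// const UglifyJS = require("uglify-js");
// console.log(minifyWith(UglifyJS, "function loadFeed() { return 1; }"));
```

Checking `result.error` explicitly matters here, because a syntax error in the input would otherwise pass silently and leave `result.code` undefined.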

javascript-obfuscator

This is a less popular NPM library, though it has one advantage over UglifyJS that makes it a standout candidate.

A standard code compression library typically only renames functions and variables and strips whitespace in order to reduce size, which makes things slightly more difficult to read. Much of that can be undone with a beautifier: the original names are gone for good, but the formatting comes back and the logic is easy to follow again.

javascript-obfuscator makes things more challenging by running your code through various other transformations that do much more than simply rename identifiers. Take the following JavaScript for example:

function hi() {
  console.log("Hello World!");
}
hi();

Passing that code through javascript-obfuscator will yield the following output:

var _0x4c2d=['2982013fizMoR','950936qMZxKo','2466IKnbwr','log','Hello\x20World!','761XUPCoz','476WTPoSZ','1007RutVMl','1715915DcXCFG','1054845qcJZnn','1718158HqkoNl'];(function(_0x3dd739,_0x5ddfca){var _0x546edc=_0x28e8;while(!![]){try{var _0x5df10a=-parseInt(_0x546edc(0x1d0))*parseInt(_0x546edc(0x1c9))+parseInt(_0x546edc(0x1cd))+parseInt(_0x546edc(0x1ca))*-parseInt(_0x546edc(0x1d3))+-parseInt(_0x546edc(0x1cb))+-parseInt(_0x546edc(0x1cc))+parseInt(_0x546edc(0x1cf))+parseInt(_0x546edc(0x1ce));if(_0x5df10a===_0x5ddfca)break;else _0x3dd739['push'](_0x3dd739['shift']());}catch(_0x25f1f3){_0x3dd739['push'](_0x3dd739['shift']());}}}(_0x4c2d,0xe58ac));function hi(){var _0x517449=_0x28e8;console[_0x517449(0x1d1)](_0x517449(0x1d2));}function _0x28e8(_0x3a116a,_0x3ac70d){_0x3a116a=_0x3a116a-0x1c9;var _0x4c2dbc=_0x4c2d[_0x3a116a];return _0x4c2dbc;}hi();

The result is a much more difficult challenge for anyone looking to make sense of your logic, even though both versions are logically identical. Note, however, that the output is several times larger than the original. A 50 kB file could easily grow to 200-300 kB, which could noticeably affect load times. According to the official documentation, the obfuscated code also runs slower.

I included it on this list precisely to show the potential side effects of transforming your code so drastically.
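Stripped of its machine-generated names, the obfuscated output above leans on one main trick: string literals are moved into a shared array and fetched through an accessor function with an offset index. The hand-written sketch below shows that idea only. The names `stringTable`, `OFFSET` and `lookup` are made up for readability, and the loop that appears to reshuffle the array in the real output is left out.

```javascript
// All string literals are pulled out of the code into one shared table.
const stringTable = ["log", "Hello World!"];

// Arbitrary offset so that lookups use unhelpful-looking numbers instead of 0 and 1.
const OFFSET = 0x1c9;

// Accessor that maps an offset index back to the original string.
function lookup(index) {
    return stringTable[index - OFFSET];
}

// Equivalent to: console.log("Hello World!");
function hi() {
    console[lookup(0x1c9)](lookup(0x1ca));
}

hi();
```

Even in this simplified form, a reader has to resolve two lookups before learning that the function just logs a greeting, and each extra indirection adds bytes and a little runtime work.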

If your biggest concern is security, then this high level of obfuscation might be a good option. A more effective approach, though, might be to simply keep sensitive code on a secure server, where the public has no access to it.

One last note on implementation. As a programmer, my first instinct when it came to code post-processing (minifying, bundling and obfuscating) was to code the functionality myself. I do not recommend anyone do that anymore.

Perhaps in the past, when third-party libraries were more scarce, this made total sense. But since then, code has grown in complexity and size, and there are far too many unknowns to take into account. Because much of the web these days relies heavily on third-party libraries, typically injected dynamically at runtime, it is a big challenge to rename things without the risk of something breaking.

While obfuscation is typically not a required step for a web developer working on a web application, it is nice to know that it is relatively easy to add if one day you find yourself looking to optimize and secure your code.

Walt is a computer scientist, software engineer, startup founder and previous mentor for a coding bootcamp. He has been creating software for the past 20 years.

Last updated on: May 03 2021