Reading Time: 7 minutes

Introduction

When you build an native application your written source code gets compiled down to a language that can be understood by the target processor you’re developing for (eg. ARM), this is referred to as machine code or assembly. Decompiling such applications are not possible as there is information that gets lost during the compilation process which can not be recovered to produce the original written code. There are ways to decipher and make sense of such code such as disassembly however, that is not the focus of this blog post and I will cover that in another write up.

Disclaimer:  This post is written in good faith for educational purposes only. I’m not responsible for the misuse of the information provided here along with the use of the tools and methods mentioned. I also do not assume responsibility for the actions carried out with the knowledge provided. Please be aware of End-user license agreements before attempting to decompile a third party’s software or application code.

.NET Platform

Working with .NET we have the luxury to recover a lot of what was originally written. The compilation process is far more involved utilizing a runtime for execution, and the output result is referred to as “managed code”.

Below is a diagram I’ve created to display the process starting from your written source code all the way to where the target architecture executes the compiled binary. The power of .NET comes from its ability to run on many different architectures similar to Java. It also takes the burden of memory management and type safety away from the developer allowing the focus to be on the accomplishment of their specific task or problem.

For this example I’m going to use the C# programming language but this also applies to all other high-level languages supported by .NET.
  • Firstly, your written source code gets compiled by the C# Compiler generating Common Intermediate Language (CIL) formally known as IL. The compiler also generates metadata on your application that enables seamless integration of components. This methodology allows .NET languages to describe themselves automatically in a language-neutral manner, unseen by both the developer and the user. At this stage the CIL is able to be decompiled back into a state somewhat resembling what was originally written. Later on I will talk more on the tools available to do so.
  • The second half of the process happens at runtime. The Common Language Runtime (CLR) is responsible for this, taking care of tasks such as automatic memory management and employing security mechanism such as Code Access Security (CAS). It creates the environment that allows the JIT compiler to take the CIL produced and transform that into machine code targeted at the current running architecture. Essentially the compiled binaries that the .NET platform creates has the capability to run on different physical architectures and operating systems. Examples of these architectures are (x86, x86_64, ARM, ARM64).

Decompilation

There are various programs out there that let you decompile .NET assemblies which are both commercial and non-commercial. We will be using one that I’ve found which provides the best output and has powerful capabilities such as debugging.

dnSpy

dnSpy is a .NET debugger and assembly editor which lets you modify and decompile assemblies. You can find the latest version via the official GitHub repo.

Unlike other CLI decompilers dnSpy provides us with a nice user interface that mirrors that of Visual Studio.

Lets start with a simple .NET Console Application as example.

The above application will ask you to enter two passphrases. If the expected inputs are correct it will execute the function ProcessRequest() The passphrases for this example are dotnetdecompile and cihansol.com and are checked using a simple CRC32 hash calculation.

For demonstration I’ve compiled the app targeting .NET Core 3.1. The output directory contains various files, the one we are interested in is Sample Application.dll You may be wondering what the exe file is for. This gets generated because the platform targeted in this demo is Windows. This file is just a shim executable that serves the purpose to load our compiled CIL code inside of the DLL binary.

Decompiling

To decompile the compiled application with dnSpy it is as simple as dragging and dropping the file into the Graphical User Interface (GUI). Once loaded you can now begin to explore, analyze and debug the Portable Executable (PE) file that contains our application code. You can see that it automatically generates the correct C# source code with the same variable names and function names intact. There is a dropdown option (IL) which shows you the actual CIL code along with an option to output as Visual Basic if desired. There is other functionality provided as well which includes being able to explore/extract resources. For example, if you have an old legacy application which contains an embedded database resource file that you require because you have lost the source code or repository you can extract and recover that with dnSpy.

It’s also worth pointing out that there is a difference in the source code from an optimization point of view. Comparing the two might be useful if you are looking to see what the compiler has done with the code you have written and may in some cases help you write more efficient code.

Our written source code
dnSpy’s generated source code

Modifying

Firstly the input is captured for passphrase A and B. This is then passed through the check function CheckInputPP(). This function will then calculate the CRC32 hash for the input string comparing it to one of the hardcoded hashes keyAHash,keyBHash.  If the newly calculated passphrase hashes match the stored ones ProcessRequest() will be called to indicate we have passed this check. This is a very basic authentication example but I am sure you could think of more advance scenarios.

Above is a graph view of the application’s control flow along with the IL instructions generated by the compiler.

Now if we are in a situation where we don’t know the Passphrases the sample application is asking for so we can go in and modify the behavior to circumvent this check. This can be done by editing the method CheckInputPP() by right clicking and selecting Edit Method (C#).

We then can just return true from that function by editing the source code. This way there will be no hash check preformed and the result will always be true.

Once done you can hit Compile and save the edited assembly by going to File -> Save Module

A popup window will show with various modification options. What we want to do is make sure the Preserve Heap Offsets checkboxes are ticked and go ahead and press OK.

Now when we go to run the application you can see that no matter our input we get the desired outcome effectively bypassing that check.

This was a very basic text book example of modification. There can be more steps involved when dealing with larger code bases and sophisticated libraries but the general procedure is the same.

Protection

So far I’ve shown you how to decompile and modify a built .NET Core application. You might now be wondering, what is stopping someone from decompiling my application potentially comprising it’s security or stealing my source code? Firstly, it comes down to following best practice approaches like not storing connection strings, passwords, secret keys for sensitive resources in your application. That being said there are also other solutions such as obfuscating your code so it is harder for someone nefarious to dissect and decipher what is going on.

To briefly demonstrate this I will be using an obfuscator on our sample application called ConfuserEx. It has various protection mechanisms that you can enable once you load your .NET assembly, these including anti-debug and variable name scrambling.

Here is a screenshot of the same sample application we loaded previously into dnSpy however, this time it has been processed through ConfuserEx. You can observe that all the variables and functions have been obfuscated and jumbled making it very hard to figure out what is going on.

Code obfuscation is a fairly extensive topic. I will dive deeper into this subject in future blog posts explaining the implications in regards to performance, along with when not to use it challenging yourself into thinking. Is this the right approach?

Conclusion

Hopefully by now you have a better understanding of how the .NET platform works using the CLR to run your code cross platform and the disadvantages that brings in terms of code security. I’ve shown you how easy it can be to decompile and modify a built .NET application along with some solutions for code obscurity. The process I have discussed in this write up is significant to developers who want a better understanding of what is involved with creating applications on the .NET platform. Hence, this highlights the importance of keeping sensitive information out of client applications or utilize tools such as Azure KeyVault to protect keys & secrets.

For my next blog post I will walk through unpacking Xamarin mobile apps for code review and the process involved in extracting the compiled .NET assemblies.

Resources & Further Reading

Published by cihansol

Software Developer