Saturday, November 12, 2016

How to integrate native code in your Java and C# applications and not die in the attempt

You read the title. Integrating native code in your C# and Java apps seems easy, very easy at first, right? In C# you just declare your externals, hold on to IntPtr pointers, pass them to your native code when you want something done, and that's it.

In Java, you create a Java module with all of your externals, compile the module, use javah to produce a C header file, and then fill in your .c JNI file.

Well, the easiness ends there.

There are four major issues that need to be addressed in both languages. If you fail to do so, your native code integration with C# and Java will very likely be unwieldy to use, buggy (crashy!), or both.

The four key considerations to be discussed:
  1. How to move data in and out of the native library. There are tons of articles covering this topic for C# and Java, and in general the final approach ends up being a matter of personal choice, so I will not touch this topic in this post.
  2. Lifecycle of unmanaged native objects.
  3. Lifecycle of managed wrapper objects.
  4. Native objects dependencies.

Lifecycle of unmanaged native objects

How do you manage the lifecycle of native objects? That's a question you face as soon as you start planning how to integrate native unmanaged code into your nice managed language.

Why is this an issue?

Well, assuming you wrote your unmanaged library in a language with explicit memory management of some kind, you now have to bridge the gap to languages that are garbage collected, and I tell you right now: neither the C# nor the Java GC plays nice with unmanaged memory if you are not careful, and .NET is in fact worse, because of how well its GC works...

Getting back to our topic at hand. What are the typical strategies to manage the lifecycle of unmanaged objects from the perspective of our managed wrappers?

There are two common strategies:
  1. Make your wrapper objects implement IDisposable in C# or Closeable and/or AutoCloseable in Java.
  2. Use finalizers to destroy your unmanaged objects.
Both approaches work, and you may end up using one or the other depending on the nature of the underlying unmanaged object.
If your unmanaged object is "pure memory", I suggest you use a finalizer to get rid of the unmanaged object. Why? Because the resulting wrapper code will feel more "natural" to your users. I'm sorry if you have read all over the place that you should not use finalizers. They are available for a reason; just don't abuse them.

IDisposable or Closeable force the developer to "do something else" besides creating an object.
Let's remember that in the world of GC languages developers don't want to think about when they need to release resources or memory. The less we force them to think about this, the better.
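
For the "pure memory" case, a minimal Java sketch might look like the following. It assumes Java 9 or later and swaps the classic finalize() method (deprecated since Java 9) for java.lang.ref.Cleaner, which serves the same purpose here. FakeNative is a hypothetical stand-in for the JNI layer so the sketch runs without a native library, and NativeBuffer is likewise an invented name.

```java
import java.lang.ref.Cleaner;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the native side: in real code these would be
// `native` methods implemented in a JNI library.
final class FakeNative {
    static final Map<Long, byte[]> HEAP = new ConcurrentHashMap<>();
    private static final AtomicLong NEXT = new AtomicLong(1);

    static long alloc(int size) {
        long handle = NEXT.getAndIncrement();
        HEAP.put(handle, new byte[size]);
        return handle;
    }

    static void free(long handle) {
        HEAP.remove(handle);
    }
}

// Wrapper for a "pure memory" native object: users create it and drop it.
final class NativeBuffer {
    private static final Cleaner CLEANER = Cleaner.create();

    private final long handle;
    private final Cleaner.Cleanable cleanable;

    NativeBuffer(int size) {
        long h = FakeNative.alloc(size);
        this.handle = h;
        // The cleanup action must not capture `this`, otherwise the wrapper
        // would never become unreachable. It only captures the handle value.
        this.cleanable = CLEANER.register(this, () -> FakeNative.free(h));
    }

    long handle() {
        return handle;
    }

    // Runs the cleanup action immediately; it runs at most once overall.
    void releaseNow() {
        cleanable.clean();
    }
}
```

With this shape, dropping the last reference to a NativeBuffer is enough: the cleanup action runs on the cleaner's thread some time after the GC notices the wrapper is unreachable. releaseNow() is only there so the behavior can be exercised deterministically.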

If your unmanaged object holds some resource, such as a file handle, socket, database connection, etc., I strongly recommend using IDisposable, Closeable or AutoCloseable. Nowadays developers know they should check whether the objects they are creating implement these kinds of interfaces, and they know how to deal with them:

In C#, use a "using" statement or manually call the Dispose() method at the end of the object's life.
In Java, call the close() method or wrap the object's use in a try ( ) { } block.
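
On the Java side, the pattern can be sketched roughly as follows. FakeFileApi is a hypothetical stand-in for a native API that hands out resource handles, and NativeFile and NativeFileDemo are invented names; a real wrapper would call JNI functions instead.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a native API that hands out OS-level resources.
final class FakeFileApi {
    static final Set<Long> OPEN = ConcurrentHashMap.newKeySet();
    private static final AtomicLong NEXT = new AtomicLong(1);

    static long open(String path) {
        long handle = NEXT.getAndIncrement();
        OPEN.add(handle);
        return handle;
    }

    static void close(long handle) {
        OPEN.remove(handle);
    }
}

// Wrapper whose lifetime is ended explicitly by the caller.
final class NativeFile implements AutoCloseable {
    private long handle;

    NativeFile(String path) {
        this.handle = FakeFileApi.open(path);
    }

    long handle() {
        if (handle == 0) throw new IllegalStateException("already closed");
        return handle;
    }

    @Override
    public void close() {
        // Safe to call more than once: only the first call releases the handle.
        if (handle != 0) {
            FakeFileApi.close(handle);
            handle = 0;
        }
    }
}

final class NativeFileDemo {
    // Opens a file, uses it, and returns how many handles remain open afterwards.
    static int openUseAndCount() {
        try (NativeFile f = new NativeFile("data.bin")) {
            f.handle(); // pretend to do native work with the handle
        } // close() runs here, even if the body throws
        return FakeFileApi.OPEN.size();
    }
}
```

The try ( ) { } block ties the release of the native resource to a scope, so the caller never has to remember a separate close() call.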

All good here, and things seem simple, right?

Not so quick.

Lifecycle of managed object wrappers

We have our unmanaged objects wrapped into managed objects.
We decided to use the IDisposable or Closeable pattern in some cases, and in others we decided to get rid of our unmanaged object using finalizers (sinful developers!).
All good here. Now, here is exactly where the fun begins, because .NET and Java GC implementations make no guarantees on the lifecycle of the wrapper object when you make calls to unmanaged code. Java is an exception if you pass to the JNI function the reference to the object itself, but that's not usually done. The most common implementations pass a "handle" (IntPtr or long) to the unmanaged code. 

That's exactly where the problems begin. It's not uncommon at all, trust me on this, for the wrapper object to become unreachable right as it calls the unmanaged code, which makes the very object that is in the middle of an unmanaged call eligible for garbage collection. The outcome is typically not pretty: segfaults, access violations and crashes, all of them very hard to debug.
Imagine you are making an unmanaged call, and *while* the unmanaged code is running the wrapper is destroyed. If your wrapper properly cleans up when the GC calls its finalizer, you are out of luck: the underlying unmanaged memory will be freed and the object destroyed while you are still doing something with that memory on a different thread. Ouch!

In the case of classes that implement the IDisposable, Closeable or AutoCloseable interfaces there's no problem as long as the proper pattern to ensure the disposal procedure is called is followed. The wrapper object's lifecycle ends when Dispose() or close() is called, so you are guaranteed to have a live object on all unmanaged code calls.

The problem with these patterns though is that developers oftentimes overlook the fact a class they are using implements any of these interfaces and let the GC handle finalization of the object, which could lead to the object being destroyed early.

The Visual Studio compiler will not warn the developer that the IDisposable pattern is not used properly unless Code Analysis is enabled on the solution, and the Java compiler will not warn the developer either if the close() or try ( ) { } pattern is not used. I tried running code analysis with JetBrains' IDEA and could not find an inspection for this situation either.

Now, the pervasiveness of this "early destruction" issue depends on many factors. In the case of .NET, I have not found that applications running in Debug mode show symptoms of objects destroyed while unmanaged calls are under way, but in contrast, it's well documented that when the application is compiled in Release mode this behavior is common. GC tries to be more effective and efficient in Release mode? I don't know the reason, but I can confirm it happens and it's very prevalent.

If you are curious, check out these tests to verify the issue: https://github.com/jsbattig/csharp-gc-helper/blob/master/gc-helper-tests/tests/UnmanagedObjectLifecycleTests.cs#L91-L171

Java is supposed to be prone to the same problem, but I have not attempted to verify the behavior yet.

Solving the issue of early disposal of managed wrappers

How to solve this issue?

.NET 

.NET provides a clear path to solve this issue. It requires the developer writing the wrapper to be disciplined: every time an unmanaged call is made that relies on the wrapper object being alive, the code must state the point up to which the wrapper has to stay alive (the object may actually live longer than that).
How do you do this? By using GC.KeepAlive(object) method.

At first sight, GC.KeepAlive() is counterintuitive. Calling this method does not somehow start keeping the object alive *from* the moment of the call, which is the first thing its name may make you think. Instead, .NET will keep the object *reachable*, and therefore alive, up to the point where the method is called.

So, what a developer must do is make sure there's a call to GC.KeepAlive(TheObject) *after* the call to the unmanaged function, and that this call to GC.KeepAlive() is reachable from the perspective of the compiler.
When the compiler sees this KeepAlive() call, it will not treat the object as unreachable, and thus garbage collectible, before that point.

Java

In Java the story is a bit different. Java (as of Java 8) doesn't have such a thing as GC.KeepAlive(), but Java guarantees that an object stays alive while it is passed as a parameter to a JNI function. As long as we pass entire objects to our JNI functions, in Java we are good.
The problem is that oftentimes JNI libraries are not written like that. It's easier to pass primitive types representing our handles to the JNI code. A typical datatype used to hold these "handles" is the long datatype, which can fit a 32-bit or 64-bit pointer. In Java, if you pass this handle to the JNI function by reading a long field, then you lose the guarantee that the object will not be collected while the JNI code is running.

This is the scenario where the Java GC behaves exactly like the .NET GC (although I have to admit that, judging by my experiments, the Java GC is not nearly as aggressive as the .NET GC). The problem with Java (as of Java 8) is that it doesn't provide a GC.KeepAlive() equivalent to keep our object alive while the JNI call is being executed.
There are many strategies you can try by "doing something" that keeps the object referenced. In most cases, these strategies may not work at all because of two things:

  1. Compiler optimization (the compiler may completely remove the code if found inconsequential).
  2. Instruction reordering (the code you wrote to keep your object referenced was moved before the JNI call).
So, how do you solve this problem in Java?

Well, if you want to learn more about the details and the different strategies, read this excellent article by Jason Greene. That is where I took the following piece of code from. It implements an equivalent of .NET's KeepAlive() that you can place in a base class for all of your wrappers:



If you want the full explanation why that code works, and why it's more desirable than a plain write to a volatile field in your class, read the article linked above.
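
Java 9 added a standard-library counterpart to GC.KeepAlive(): java.lang.ref.Reference.reachabilityFence(Object). The sketch below is not the code from the article; it is a minimal illustration that assumes Java 9 or later, with hypothetical names (FakeHeap stands in for the JNI library, Counter for the wrapper). Cleanup of the native object is left out to keep it short.

```java
import java.lang.ref.Reference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for native memory addressed by raw handles.
final class FakeHeap {
    static final Map<Long, int[]> CELLS = new ConcurrentHashMap<>();
    private static final AtomicLong NEXT = new AtomicLong(1);

    static long create() {
        long handle = NEXT.getAndIncrement();
        CELLS.put(handle, new int[1]);
        return handle;
    }

    // Plays the role of a JNI function that only receives the raw handle.
    static int increment(long handle) {
        int[] cell = CELLS.get(handle);
        if (cell == null) throw new IllegalStateException("use after free");
        return ++cell[0];
    }
}

final class Counter {
    private final long handle = FakeHeap.create();

    int increment() {
        try {
            // Only the long value is passed on, so without the fence the
            // wrapper could be treated as unreachable during this call.
            return FakeHeap.increment(handle);
        } finally {
            // Keeps `this` strongly reachable at least until this point.
            Reference.reachabilityFence(this);
        }
    }
}
```

Like GC.KeepAlive(), the fence goes *after* the native call, and the try/finally makes sure it is reached on every path out of the method.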

So, we solved the problem of early disposal in .NET and Java, that's it?

No, that's not it.

Native objects dependencies

The final problem I wanted to discuss is how to handle native objects dependencies.
If you created a wrapper for a library whose model has even just a simple hierarchical structure between classes, you will face the problem of how to handle the lifecycles of dependencies (parent-child relationships).
What do I mean by this?

Well, let's start with the basic premise that in garbage collected languages such as Java and .NET, the order in which finalizers are called is completely unpredictable. What's the consequence of this?

Say you have an object A whose parent is object B, and A's destruction code requires B to be alive in order to perform some cleanup. If you decided not to use IDisposable, Closeable or AutoCloseable, or if you simply forgot to use the proper primitives to ensure calls to Dispose() or close(), then you are out of luck.

When both A and B become unreachable, there are no guarantees on the order of the finalizer calls.

Think about the C# pattern for writing the Dispose() method. If Dispose() was not called explicitly, meaning it was called by the finalizer, you should not touch any managed object you own. Why? Because those objects may or may not have been finalized before the class in question. That simple. Even though this class has references to other managed objects, it can't assume those objects are still alive...

What I ended up doing to handle unmanaged object parent/child dependencies was to write a small library in C#, and port it to Java, that works as a GC helper. This library lets you register unmanaged object handles, destruction delegates and parents.

The key points I decided to hit with this library are:


  1. The library is thread safe for registration, unregistrations, adding and removing dependencies.
  2. Objects are reference counted.
  3. Multiple registrations of the same object are permitted. This happens in situations where the same underlying object is returned by more than one unmanaged function call, so multiple wrappers may be created for it.
  4. An object can have multiple parents (dependencies). When registering a parent, its reference count is atomically increased by 1 and when an object is destroyed, all its parents have their reference count decreased by 1 immediately after the object destructor delegate has been called.
  5. Destructor delegates are called from a separate "agent" thread to avoid the potential performance impact on the GC collector thread and to reduce the risk of crashing the GC thread itself if something goes wrong when calling these delegates.
  6. The library is written entirely with a lock-free approach.
  7. To further remove contention, an entry in the core collection is represented by a Pair<ClassType, HandleType>. This helps in situations where the same handle value shows up for objects of different classes. That is possible when the wrapper doesn't manage the lifecycle of the underlying object but only carries a handle to it: the object can be destroyed at some point while the wrapper is still queued for removal, and a new object of a different class is then handed out with the same handle value.
These are the two repos that implement this library:

https://github.com/jsbattig/java-gc-helper
https://github.com/jsbattig/csharp-gc-helper
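
To make the reference counting and parent handling (points 2 and 4) concrete, here is a heavily simplified sketch. It is not the API of either repository: HandleRegistry and its methods are hypothetical, it uses plain synchronized methods instead of the lock-free design, it keys entries by handle only, and it runs destructors on the calling thread rather than on an agent thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class HandleRegistry {
    private static final class Entry {
        int refCount = 1;
        final Runnable destructor;
        final List<Long> parents = new ArrayList<>();

        Entry(Runnable destructor) {
            this.destructor = destructor;
        }
    }

    private final Map<Long, Entry> entries = new HashMap<>();

    // Registers a handle, or adds a reference if it is already registered.
    synchronized void register(long handle, Runnable destructor) {
        Entry e = entries.get(handle);
        if (e != null) {
            e.refCount++;
            return;
        }
        entries.put(handle, new Entry(destructor));
    }

    // Declares that `child` needs `parent` alive; the parent gains a reference.
    synchronized void addParent(long child, long parent) {
        Entry c = entries.get(child);
        Entry p = entries.get(parent);
        if (c == null || p == null) throw new IllegalArgumentException("unknown handle");
        c.parents.add(parent);
        p.refCount++;
    }

    // Drops one reference. At zero the destructor runs first and only then
    // are the parents released, so a parent always outlives its children.
    synchronized void release(long handle) {
        Entry e = entries.get(handle);
        if (e == null) return;
        if (--e.refCount > 0) return;
        entries.remove(handle);
        e.destructor.run();
        for (long parent : e.parents) {
            release(parent);
        }
    }

    synchronized boolean isAlive(long handle) {
        return entries.containsKey(handle);
    }
}
```

In this sketch a wrapper's finalizer would only call release(handle). Even if the parent's wrapper is finalized first, the parent's native object is destroyed only after the child's destructor has run.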

Summary

If you are going to wrap an unmanaged/native library in C# or Java, hit the following points, and you can't go wrong:
  1. Decide which classes will be implemented as:
    1. Plain and simple managed objects that rely on regular GC plus a finalizer to trigger underlying object destruction.
    2. IDisposable, Closeable or AutoCloseable objects that rely on the developer to implement the proper disposal pattern.
  2. Use GC.KeepAlive(), or an analogous mechanism in the case of Java, to ensure your wrapper objects (and the underlying native objects) are not destroyed while making an unmanaged call.
  3. Decide on a strategy to manage unmanaged object dependencies (parent/child relationships). Remember that you can't rely on GC finalizer call order.

Best.