Assembly Neutral Types Implementation

Last time we spoke about assembly neutral interfaces and what they were. This post describes how we implemented them in the ASP.NET vNext stack.

Today, types in the CLR must have an assembly name. That's an intrinsic piece of metadata that cannot be removed without CLR changes. While brainstorming on what could be done with the tools we had today, my manager @mmontwil had a genius idea. Using Roslyn, we could discover all types with an assembly neutral attribute, generate an assembly for each of those types, then stash that definition away somewhere to be loaded later at runtime (@loudej then decided that we should store them as embedded resources).

Using the power of Roslyn, this is what we implemented:

  1. Discover all assembly neutral types via an attribute [AssemblyNeutral].
  2. Compile each of those types (in the right order of course) into assemblies of their own.
  3. Remove those types from the original assembly.
  4. Stash the compiled assemblies as embedded resources in the original assembly.
  5. At runtime, when an assembly is loaded, unpack the embedded assemblies and load them into the app domain. We only load the first one for a given assembly name. This is important because, as we said in the previous post, the definition cannot differ for a given name.
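
Step 5 could be sketched roughly like this. It is a minimal sketch, not the actual vNext loader: the `NeutralAssemblyLoader` class, its `LoadEmbedded` method, and the convention that each embedded resource is named `<AssemblyName>.dll` are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;

// Hypothetical helper; the names and the ".dll" resource convention are assumptions.
public static class NeutralAssemblyLoader
{
    // Assembly simple name -> the assembly that was loaded first under that name.
    private static readonly Dictionary<string, Assembly> Loaded =
        new Dictionary<string, Assembly>(StringComparer.OrdinalIgnoreCase);

    // Unpacks every embedded "*.dll" resource of the carrier and loads it,
    // unless an assembly with the same name was already loaded.
    // Returns the names that were newly loaded by this call.
    public static IList<string> LoadEmbedded(Assembly carrier)
    {
        var added = new List<string>();
        foreach (var resourceName in carrier.GetManifestResourceNames())
        {
            if (!resourceName.EndsWith(".dll", StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }

            // "Logging.ILogger.dll" -> "Logging.ILogger"
            var assemblyName = Path.GetFileNameWithoutExtension(resourceName);
            lock (Loaded)
            {
                if (Loaded.ContainsKey(assemblyName))
                {
                    continue; // first one wins; later copies are ignored
                }

                using (var stream = carrier.GetManifestResourceStream(resourceName))
                using (var buffer = new MemoryStream())
                {
                    stream.CopyTo(buffer);
                    Loaded[assemblyName] = Assembly.Load(buffer.ToArray());
                }
                added.Add(assemblyName);
            }
        }
        return added;
    }
}
```

Calling `NeutralAssemblyLoader.LoadEmbedded(assembly)` for each assembly as it loads would give the first-one-wins behavior described in step 5. A second carrier that embeds the same name is simply skipped.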

The Roslyn API allows you to navigate the symbols of a compilation. Not the syntax tree, but the symbols. This means with a reflection-like API we can easily find all of the types with a specific attribute:

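public void FindTypeCompilations(INamespaceOrTypeSymbol symbol)
{
    if (symbol.IsNamespace)
    {
        // Namespaces contain nested namespaces and types; walk all of them.
        foreach (var member in ((INamespaceSymbol)symbol).GetMembers())
        {
            FindTypeCompilations(member);
        }
    }
    else
    {
        var typeSymbol = (ITypeSymbol)symbol;
        foreach (var attribute in typeSymbol.GetAttributes())
        {
            // Match on the simple name so any re-declared AssemblyNeutralAttribute counts.
            // AttributeClass can be null when the attribute failed to bind.
            if (attribute.AttributeClass != null &&
                attribute.AttributeClass.Name == "AssemblyNeutralAttribute")
            {
                _typeCompilationContexts.Add(new TypeCompilationContext(this, typeSymbol));
                break; // one context per type is enough
            }
        }
    }
}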

Given this list of types, we need to compile them in order. Order is significant because these types can depend on each other. Here's an example:

Assembly - Logger.dll

using System;

namespace Logging
{
  public class ConsoleLoggerFactory : ILoggerFactory
  {
      public ILogger Create(string category)
      {
          return new ConsoleLogger(category);
      }
  }

  public class ConsoleLogger : ILogger
  {
      private readonly string _category;
      public ConsoleLogger(string category)
      {
          _category = category;
      }

      public void Log(LogType logType, string message)
      {
          Console.WriteLine("[{0}]: {1} {2}", _category, logType, message);
      }
  }

  [AssemblyNeutral]
  public interface ILoggerFactory
  {
      ILogger Create(string category);
  }

  [AssemblyNeutral]
  public interface ILogger
  {
      void Log(LogType logType, string message);
  }

  [AssemblyNeutral]
  public enum LogType
  {
      Verbose,
      Information,
      Error
  }

  [AssemblyNeutral]
  public class AssemblyNeutralAttribute : Attribute {   }
}

As you can see in the above example, we would create 5 assemblies:

  • Logging.AssemblyNeutralAttribute.dll

  • Logging.LogType.dll -> Logging.AssemblyNeutralAttribute.dll

  • Logging.ILoggerFactory.dll -> Logging.ILogger.dll, Logging.AssemblyNeutralAttribute.dll

  • Logging.ILogger.dll -> Logging.LogType.dll, Logging.AssemblyNeutralAttribute.dll

  • Logging.dll -> Logging.ILogger.dll, Logging.ILoggerFactory.dll, Logging.LogType.dll, Logging.AssemblyNeutralAttribute.dll

Notice the AssemblyNeutralAttribute is also AssemblyNeutral.

Order is important because we need to produce the assemblies with the fewest dependencies before we can produce the ones with the most. This is a standard topological sort.
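
Here is a minimal sketch of that ordering, assuming the dependency graph has already been extracted into a dictionary from assembly name to the names it references. `DependencySorter` and its `Sort` method are hypothetical names, not part of the vNext code base; the technique is a plain depth-first topological sort.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: orders assemblies so each one comes after its dependencies.
public static class DependencySorter
{
    private const int InProgress = 1;
    private const int Done = 2;

    // dependencies maps a name to the names it references.
    // Throws InvalidOperationException if the graph contains a cycle.
    public static List<string> Sort(IDictionary<string, string[]> dependencies)
    {
        var sorted = new List<string>();
        var state = new Dictionary<string, int>();

        foreach (var node in dependencies.Keys)
        {
            Visit(node, dependencies, state, sorted);
        }
        return sorted;
    }

    private static void Visit(
        string node,
        IDictionary<string, string[]> dependencies,
        Dictionary<string, int> state,
        List<string> sorted)
    {
        int current;
        if (state.TryGetValue(node, out current))
        {
            if (current == InProgress)
            {
                throw new InvalidOperationException("Circular reference involving " + node);
            }
            return; // already emitted
        }

        state[node] = InProgress;

        string[] references;
        if (dependencies.TryGetValue(node, out references))
        {
            foreach (var reference in references)
            {
                Visit(reference, dependencies, state, sorted);
            }
        }

        state[node] = Done;
        sorted.Add(node); // emitted only after everything it depends on
    }
}
```

Fed the Logger example above, `Sort` returns `Logging.AssemblyNeutralAttribute`, `Logging.LogType`, `Logging.ILogger`, `Logging.ILoggerFactory`, `Logging`, since the dependencies form a single chain. The sketch throws when it finds a cycle, which a plain topological sort cannot order.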

But what about this case:

using System;  
using System.Threading.Tasks;

namespace HttpStuff  
{
  [AssemblyNeutral]
  public interface IHttpContext
  {
    IHttpRequest Request { get; }
    IHttpResponse Response { get; }
  }

  [AssemblyNeutral]
  public interface IHttpResponse
  {
    IHttpContext Context { get; } 
    Task Write(byte[] data);
  }

  [AssemblyNeutral]
  public interface IHttpRequest
  {
    string Url { get; }
    IHttpContext Context { get; }
  }

  [AssemblyNeutral]
  public class AssemblyNeutralAttribute : Attribute { }
}

Here we have a circular reference: IHttpContext points to both IHttpRequest and IHttpResponse, and IHttpResponse and IHttpRequest both point back to IHttpContext. What's the right order to create these assemblies? This is something you can't do today using Visual Studio, but you can do it with the Roslyn APIs. (Yes, CLR metadata allows circular references.)

Let's say you wanted to achieve this today. What could you do? A two-phase compilation would work, since all we need is enough metadata to pass to the compiler as a reference. Doing this manually, we need to:

  1. Compile IHttpRequest and IHttpResponse with empty bodies
  2. Compile IHttpContext against the 2 generated stub dlls for IHttpRequest and IHttpResponse
  3. Re-compile IHttpRequest and IHttpResponse with IHttpContext as a reference.

A manual process looks like this:

IHttpRequest.cs

namespace HttpStuff  
{ 
    public interface IHttpRequest
    {
#if REAL
        string Url { get; }
        IHttpContext Context { get; }
#endif
    }
}

IHttpResponse.cs

using System.Threading.Tasks;

namespace HttpStuff
{
    public interface IHttpResponse
    {
#if REAL
        IHttpContext Context { get; } 
        Task Write(byte[] data);
#endif
    }
}

IHttpContext.cs

namespace HttpStuff  
{
    public interface IHttpContext
    {
        IHttpRequest Request { get; }
        IHttpResponse Response { get; }
    }
}

Generate IHttpContext.dll

csc /target:library IHttpRequest.cs  
csc /target:library IHttpResponse.cs  
csc /target:library /r:IHttpRequest.dll /r:IHttpResponse.dll IHttpContext.cs  

Generate the real IHttpRequest and IHttpResponse

csc /target:library /r:IHttpContext.dll IHttpRequest.cs /define:REAL

csc /target:library /r:IHttpContext.dll IHttpResponse.cs /define:REAL  

This is what we do behind the scenes.
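
The same three steps can be sketched in-process with the public Roslyn API. This is a sketch under assumptions, not the vNext implementation: `TwoPhaseCompiler`, `Compile`, and `Build` are hypothetical names, the sources are expected to use the `#if REAL` layout shown above, and the code needs the Microsoft.CodeAnalysis.CSharp package.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper mirroring the csc commands above.
public static class TwoPhaseCompiler
{
    // Compiles one source file to an in-memory assembly image.
    // defineReal mirrors "/define:REAL"; references mirror "/r:".
    public static byte[] Compile(
        string assemblyName, string source, bool defineReal, params byte[][] references)
    {
        var parseOptions = defineReal
            ? CSharpParseOptions.Default.WithPreprocessorSymbols("REAL")
            : CSharpParseOptions.Default;

        var metadata = new List<MetadataReference>
        {
            // Core library, needed for string and Task.
            MetadataReference.CreateFromFile(typeof(object).Assembly.Location)
        };
        metadata.AddRange(references.Select(image => MetadataReference.CreateFromImage(image)));

        var compilation = CSharpCompilation.Create(
            assemblyName,
            new[] { CSharpSyntaxTree.ParseText(source, parseOptions) },
            metadata,
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        using (var output = new MemoryStream())
        {
            var result = compilation.Emit(output);
            if (!result.Success)
            {
                throw new InvalidOperationException(
                    string.Join(Environment.NewLine, result.Diagnostics));
            }
            return output.ToArray();
        }
    }

    public static IDictionary<string, byte[]> Build(
        string requestSource, string responseSource, string contextSource)
    {
        // Step 1: stubs with empty bodies (REAL is not defined).
        var requestStub = Compile("IHttpRequest", requestSource, false);
        var responseStub = Compile("IHttpResponse", responseSource, false);

        // Step 2: IHttpContext compiled against the two stubs.
        var context = Compile("IHttpContext", contextSource, false, requestStub, responseStub);

        // Step 3: the real IHttpRequest and IHttpResponse, now referencing IHttpContext.
        return new Dictionary<string, byte[]>
        {
            { "IHttpContext", context },
            { "IHttpRequest", Compile("IHttpRequest", requestSource, true, context) },
            { "IHttpResponse", Compile("IHttpResponse", responseSource, true, context) }
        };
    }
}
```

Keeping the images in memory avoids writing the intermediate stub dlls to disk. The stubs only need to exist long enough to serve as references for step 2.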

De-duping other assembly neutral references

Since we allow you to re-declare assembly neutral types in your own assembly, you can run into situations where types declared in multiple assemblies need to be seen as the same type:

ConsoleLogger.dll

using System;

namespace Logging  
{
    public class ConsoleLogger : ILogger
    {
        public void Log(string message)
        {
            Console.WriteLine(message);
        }
    }

    [AssemblyNeutral]
    public interface ILogger
    {
        void Log(string message);
    }

    [AssemblyNeutral]
    public class AssemblyNeutralAttribute : Attribute
    {
    }
}

TraceLogger.dll

using System;  
using System.Diagnostics;

namespace Logging  
{
    public class TraceLogger : ILogger
    {
        public void Log(string message)
        {
            Trace.TraceInformation(message);
        }
    }

    [AssemblyNeutral]
    public interface ILogger
    {
        void Log(string message);
    }

    [AssemblyNeutral]
    public class AssemblyNeutralAttribute : Attribute
    { 
    }
}

LoggerFactory.dll depends on both ConsoleLogger.dll and TraceLogger.dll

using System;

namespace Logging  
{
    public static class LoggerFactory
    {
        public static ILogger Create(int type)
        {
            if (type == 0)
            {
                return new ConsoleLogger();
            }

            return new TraceLogger();
        }
    }

    [AssemblyNeutral]
    public interface ILogger
    {
        void Log(string message);
    }
}

This should all compile and work just fine. Even though all 3 assemblies redefined the ILogger interface, they are all seen as equivalent.

What the assembly metadata looks like

Let's look at what this process does to the assembly metadata. Using ILDASM, let's look at a normal type definition:

MyCompany.MyLogger.dll

namespace MyCompany.MyLogger
{
    public interface ILogger
    {
        void Log(string message);
    }

    public class Logger : ILogger
    {
        public void Log(string message)
        {
        }
    }
}

Looking at the metadata table for the 2 types we see that they are both type definitions in this assembly, nothing special:

TypeDef #1 (02000002)  
-------------------------------------------------------
    TypDefName: MyCompany.MyLogger.ILogger  (02000002)
    Flags     : [Public] [AutoLayout] [Interface] [Abstract] [AnsiClass]  (000000a1)
    Extends   : 01000000 [TypeRef] 
    Method #1 (06000001) 
    -------------------------------------------------------
        MethodName: Log (06000001)
        Flags     : [Public] [Virtual] [HideBySig] [NewSlot] [Abstract]  (000005c6)
        RVA       : 0x00000000
        ImplFlags : [IL] [Managed]  (00000000)
        CallCnvntn: [DEFAULT]
        hasThis 
        ReturnType: Void
        1 Arguments
            Argument #1:  String
        1 Parameters
            (1) ParamToken : (08000001) Name : message flags: [none] (00000000)


TypeDef #2 (02000003)  
-------------------------------------------------------
    TypDefName: MyCompany.MyLogger.Logger  (02000003)
    Flags     : [Public] [AutoLayout] [Class] [AnsiClass] [BeforeFieldInit]  (00100001)
    Extends   : 01000006 [TypeRef] System.Object
    Method #1 (06000002) 
    -------------------------------------------------------
        MethodName: Log (06000002)
        Flags     : [Public] [Final] [Virtual] [HideBySig] [NewSlot]  (000001e6)
        RVA       : 0x00002050
        ImplFlags : [IL] [Managed]  (00000000)
        CallCnvntn: [DEFAULT]
        hasThis 
        ReturnType: Void
        1 Arguments
            Argument #1:  String
        1 Parameters
            (1) ParamToken : (08000002) Name : message flags: [none] (00000000)

    Method #2 (06000003) 
    -------------------------------------------------------
        MethodName: .ctor (06000003)
        Flags     : [Public] [HideBySig] [ReuseSlot] [SpecialName] [RTSpecialName] [.ctor]  (00001886)
        RVA       : 0x00002053
        ImplFlags : [IL] [Managed]  (00000000)
        CallCnvntn: [DEFAULT]
        hasThis 
        ReturnType: Void
        No arguments.

    InterfaceImpl #1 (09000001)
    -------------------------------------------------------
        Class     : MyCompany.MyLogger.Logger
        Token     : 02000002 [TypeDef] MyCompany.MyLogger.ILogger

Now let's make the ILogger interface [AssemblyNeutral]:

TypeDef #1 (02000002)  
-------------------------------------------------------
    TypDefName: MyCompany.MyLogger.Logger  (02000002)
    Flags     : [Public] [AutoLayout] [Class] [AnsiClass] [BeforeFieldInit]  (00100001)
    Extends   : 01000006 [TypeRef] System.Object
    Method #1 (06000001) 
    -------------------------------------------------------
        MethodName: Log (06000001)
        Flags     : [Public] [Final] [Virtual] [HideBySig] [NewSlot]  (000001e6)
        RVA       : 0x00002050
        ImplFlags : [IL] [Managed]  (00000000)
        CallCnvntn: [DEFAULT]
        hasThis 
        ReturnType: Void
        1 Arguments
            Argument #1:  String
        1 Parameters
            (1) ParamToken : (08000001) Name : message flags: [none] (00000000)

    Method #2 (06000002) 
    -------------------------------------------------------
        MethodName: .ctor (06000002)
        Flags     : [Public] [HideBySig] [ReuseSlot] [SpecialName] [RTSpecialName] [.ctor]  (00001886)
        RVA       : 0x00002053
        ImplFlags : [IL] [Managed]  (00000000)
        CallCnvntn: [DEFAULT]
        hasThis 
        ReturnType: Void
        No arguments.

    InterfaceImpl #1 (09000001)
    -------------------------------------------------------
        Class     : MyCompany.MyLogger.Logger
        Token     : 01000007 [TypeRef] MyCompany.MyLogger.ILogger

Notice the InterfaceImpl section has changed from a TypeDef to a TypeRef. That's because we told the compiler this type now comes from an external assembly instead of being defined in the assembly itself. Roslyn automatically fixes this up for us when we remove the types from the assembly's compilation and replace them with assembly references. This is what makes Roslyn hold a special place in my heart. It's just magical :).

Rules of the road

Assembly neutral types need to follow these rules:

  • They must not reference anything in the original assembly (unless those types are assembly neutral). You can't have an assembly neutral type that references another type in the assembly where it's defined.
  • They can reference other assembly neutral types and types in reference assemblies. This means that the assembly neutral type can access anything the containing assembly is referencing.
  • The types must match exactly! We load the first type with a matching name, so if the definitions don't match, expect chaos.
  • You can embed any type, not just interfaces, but it's best to stay away from anything that contains IL (like method bodies). Stick with interfaces, enums, and classes with fields only, not properties.
  • Assembly neutral types can be re-declared in source. This means you don't have to have an assembly reference to use the contract.
  • You may reference a carrier assembly full of assembly neutral types. The compiler will explode them and only embed the ones you use.

Here's what it looks like in Reflector:

[Image: Assembly Neutral Carrier Assembly]

If you're interested in the implementation, check it out on GitHub.
