Over the past several months, we’ve been working on a new set of collection types that offer an immutable design. Today we are happy to announce that we are one step closer to a stable version of this work: we’ve just shipped the release candidate (1.0.23 RC) of the Microsoft.Bcl.Immutable NuGet package.
Because we could: an immutable representation of our release timeline
What has changed in RC?
The changes in this update can be summarized into the following areas:
- Removal of
ImmutableArray
. With a heavy heart we’ve decided to removeImmutableArray
for now because we aren’t happy with the current design. - Comparers. We’ve simplified the exposed concepts by not exposing the comparers through the interfaces.
- Consistent construction. In the last updated, we split the factory methods into methods that deal with scalars and ones that deal with other collections. We’ve improved this by renaming some APIs in order apply the split consistently.
- Consistency with
ToString
. Previously, we didn’t overrideToString
consistently. We’ve fixed this by not overridingToString
at all. - Minor improvements. We’ve added a few convenience APIs.
I’ll discuss these areas in more detail.
ImmutableArray
removed for now
After careful consideration we decided to remove ImmutableArray
from our 1.0 release plans. We aren’t happy with the design as it is today. This doesn’t mean we have given up on having an ImmutableArray
– it just means we need more time to do it properly, and we don’t want to block releasing a stable version of immutable collections because we aren’t happy with a single type. After we ship the stable 1.0, we’ll ship a 1.1 beta that will include ImmutableArray
.
Here’s why we weren’t happy with ImmutableArray
:
The primary goal for ImmutableArray
is to be a zero-overhead wrapper around an ordinary array. The current implementation, although it had little overhead, didn’t satisfy the performance needs of one of our most important partners, Project Roslyn, which served as an excellent benchmark for building a large-scale immutable system.
Because of performance requirements, ImmutableArray
needs to be a value type, i.e., a struct. In the CLR, all value types have an implied default state. This state is defined as having an underlying memory of all zeros. For ImmutableArray
that means that the underlying array it wraps is null. In order to provide meaningful semantics, we believe it’s best for an immutable array in its default state to behave exactly like an empty array. That’s consistent with the way the .NET Framework deals with value types in general.
Unfortunately, we discovered that this design causes a general performance issue. The JIT compiler is normally able to eliminate the bounds checks for standard for
loops. For ImmutableArray
, the indexer and Length
property are inlined so one would think the bounds check would be eliminated as well. Unfortunately, that’s not the case, because these properties don’t just forward to the underlying array but also contain an additional null check to provide the behavior I explained. This causes the JIT to no longer be able to eliminate the bounds checks. Our partner Roslyn uses ImmutableArray
throughout its system, so this caused an unacceptable performance loss.
Today, the only way to get the performance we need is to remove the additional null checks from the indexer and Length
property. However, we believe this has significant usability issues.
For example, consider the following code:
int[] ints = new int[] { 1, 2, 3 };
ImmutableArray
ImmutableArray
Since all value types have an automatic default constructor that initializes the value type to its default state, empty will be an ImmutableArray
whose underlying array is null. Thus, the implementation of AddRange
will crash with a NullReferenceException
.
There are other places where this problem manifests itself. Since ImmutableArray
implements certain collection interfaces (such as IEnumerable
and IReadOnlyList
), you can pass it to methods that work against the interface. Since the interface reference will be non-null, consumers won’t expect any methods or property invocations to cause a NullReferenceException
.
Not exactly a happy place. However, we do have some ideas on how to address this issue and decided to give it more baking time by removing ImmutableArray
for now.
Reduced complexity around comparers
After reviewing some of your feedback, we decided that certain APIs that exposed comparers were overkill within the current design. Thus, we removed IHashKeyCollection
and ISortKeyCollection
as well as the ValueComparer
property from IImmutableDictionary
.
Consistent construction
When we introduced factory methods for constructing immutable collections, we also removed all Empty
properties. So instead of:
return ImmutableList
you have to write:
return ImmutableList.Create
We received feedback that this looks quite bad in code reviews, because most reviewers assume that Create
would return a new instance. For added expressiveness, we decided to bring back the Empty
properties, so now you have both options.
In the past update, we introduced the From
factory method so you could differentiate between factory methods that work on scalars (Create
) and and factory methods that operate on collections (From
). We realized that the name From
is an unfortunate choice because it doesn’t show up together in IntelliSense with Create
. In the RC update, we renamed From
to CreateRange
.
Also, our naming wasn’t fully consistent. There are some Create
overloads on ImmutableDictionary
that accepted an IEnumerable
. We renamed those to CreateRange
as well.
Consistency with ToString
ImmutableDictionary
now no longer overrides ToString
(it was the only collection that did that anyway). This means that calling ToString
on any immutable collection will now return the fully qualified type (which is the behavior of Object.ToString
).
For debugging purposes, all collections already make use of DebuggerDisplayAttribute and DebuggerTypeProxyAttribute.
Minor improvements
The current signature of the ToImmutableDictionary
method requires two selectors: one that determines the key and one that determines the value. This is inconsistent with Enumerable.ToDictionary, which provides an overload that allows you to specify only the key. The RC update includes the equivalent overload for ToImmutableDictionary
. Thus, code like this will now compile:
ImmutableList
ImmtuableDictionary
Performing a binary search is a fairly common operation for collections. Since ImmutableList
is using a tree behind the scenes, indexing is O(log n). Thus, providing a BinarySearch
method on ImmutableList
can do a better job because it can traverse the internal tree and avoid having to repeatedly re-locate the node in the loop. For symmetry, we also added the BinarySearch
method to the ImmutableList
.
What has not changed?
We’ve received a lot of feedback on immutable collections. There were two major design points many of you commented on: the lack of constructors and the implementation of mutable interfaces. I’d like to explain these design decisions.
The lack of constructors
Some of you asked us to bring back the constructors. The concern was that this code:
ImmutableList
is, by far, more intuitive than this code:
ImmutableList
That’s absolutely true. When designing APIs, we generally avoid factory methods because they are often an indication of overengineering and generally hurt usability. For example, most developers would agree that XDocument
is easier to use than XmlDocument
. The fact that XDocument
avoids factories is one of the reasons why it’s more usable.
The challenge in designing immutable APIs is that it’s very important to avoid unnecessary allocations. For example, most immutable collections start off with an empty instance. Offering a constructor will inevitably cause significant GC pressure due to the fact that this is the most intuitive way to construct them.
Using factory methods also provide other interesting opportunities for optimization. For example, consider this code:
class NumberProcessor
{
private readonly ImmutableList
NumberProcessor(IEnumerable
{
_numbers = ImmutableList.CreateRange
}
// other methods elided for brevity
}
The goal of this code is to keep the numbers around. Since the input is typed as IEnumerable
, you can pass virtually any collection, including collections that are already immutable. In case numbers
is already an instance of ImmutableList
, the CreateRange
factory method can simply cast it and thus avoid the construction of a new object. A constructor would have forced us to create a brand new immutable list.
There is also the problem of collection initializers. Suppose we provide a default constructor for ImmutableList
. In that case, the following code would compile just fine:
ImmutableList
Console.WriteLine(items.Count);
You would probably expect the output to be 3
but in fact it’s 0
. The reason is that collection initializers follow a pattern. As long as the type has a constructor, implements IEnumerable
, and provides a public Add
method, the compiler lets you use the syntax above. The code that the compiler generates looks like this:
ImmutableList
items.Add(1);
items.Add(2);
items.Add(3);
Console.WriteLine(items.Count);
Now it’s clear why the output is 0
: It’s because collection initializers don’t expect the Add
method to return a new instance and instead rely on side effects to mutate the existing instance. Since the result is never used, the items variable still refers to the original, empty collection.
Thus, we decided to exclusively offer factory methods to ensure that you get the best behavior and cannot accidentally do the wrong thing.
Implementing the mutable interfaces
Several people raised the issue that our immutable collection types implement the mutable interfaces. For example, this is what the declaration of ImmutableList
looks like:
public sealed class ImmutableList
IImmutableList
IEnumerable,
IEnumerable
IReadOnlyCollection
IReadOnlyList
ICollection,
ICollection
IList,
IList
{
...
}
It’s worth pointing out that this declaration does not have a correctness issue – you cannot mutate an instance of ImmutableList
by casting it to a mutable interface. For example, calling IList
would throw NotSupportedException
.
The mutable interfaces always supported a read-only mode, which means that consumers can’t change the contents. This mode is exposed via ICollection.IsReadOnly, which all mutable interfaces inherit from. Thus, the concrete immutable collection types all implement the existing, mutable interfaces in a read-only fashion, which means that IsReadOnly
returns true and all methods that would mutate the collection throw NotSupportedException
.
That’s not fundamentally incompatible with immutable collections; for all intents and purposes, immutable collections provide an even stronger guarantee that nobody, not just the consumer, can change their contents. Some of you suggested that we shouldn’t implement the mutable interfaces at all, so you shouldn’t be able to accidentally pass an immutable collection to a method that takes an IList
. This would compile just fine, but might cause a NotSupportedException
at runtime if the method in question tries to modify the collection through the mutable interface.
Based on the data we have, we found that in many cases where methods take a mutable interface (such as IList
), the consumer only needs to read the contents. We believe that enabling interoperability with mutable collections is very important, because otherwise it would be difficult to adopt immutable collections into existing code.
Also, this interoperability is consistent with how the rest of the .NET Framework is designed. For example, Streams can support reading, writing, and seeking. Instead of providing separate types, we’ve decided to expose the capabilities via the optional features pattern because it is typically much simpler and doesn’t result in an explosion of combinatorial types. The IsReadOnly
property on ICollection
is another instance of this pattern.
Thus, we decided that it’s best to implement the mutable interfaces on the concrete types.
Summary
Today we shipped the release candidate of our Microsoft.Bcl.Immutable NuGet package. We plan to release a stable 1.0 package by the end of September. We’ve removed ImmutableArray
from 1.0 because we weren’t happy with its design. It will re-appear in 1.1, which will be available as a beta release shortly after we ship stable 1.0 package.
Please play around with the RC and let us know what you think!