C# internals: foreach statement

After a while, we’re getting back to the series dedicated to the internals of the C# language. Personally, I think the previous part about iterators was quite long and complicated, so today I decided to choose a much simpler (but hopefully still interesting) subject: the foreach statement.

 

Foreach only with IEnumerable?

Before we jump into more internal-ish stuff, I’d like to cover one thing which is quite interesting. When we go to the MSDN page about the foreach statement, the very first paragraph says the following:

„The foreach statement repeats a group of embedded statements for each element in an array or an object collection that implements the System.Collections.IEnumerable or System.Collections.Generic.IEnumerable<T> interface.”

So, according to the official docs, it shouldn’t be possible to iterate over a type which does not implement the IEnumerable interface. Let’s see whether that’s true, starting with this piece of code:

 


    class Program
    {
        static void Main(string[] args)
        {

            foreach (var element in new MyList())
            {
                Console.WriteLine(element);
            }
        }
    }

    class MyList
    {
    }

 

The above code will not compile, but why is that?

 

 

Well, the error (CS1579) clearly says that MyList doesn’t contain a public definition of a GetEnumerator method, but it says nothing about the IEnumerable interface. Let’s modify the code a little bit:

 


    class Program
    {
        static void Main(string[] args)
        {

            foreach (var element in new MyList())
            {
                Console.WriteLine(element);
            }
        }
    }

    class MyList
    {
        public MyList GetEnumerator() => this;
    }

 

The only thing I added was the required GetEnumerator method, which returns this. I did that only because I wanted to avoid creating another class for the enumerator (the same technique is used in the iterator’s state machine 😉 ). Let’s see the compiler message this time:

 

 

This time the enumerator (so the MyList type) is missing a MoveNext method and a Current property, but there’s still nothing related to IEnumerable. Let’s modify the code one last time:

 


    class Program
    {
        static void Main(string[] args)
        {

            foreach (var element in new MyList())
            {
                Console.WriteLine(element);
            }
        }
    }

    public class MyList
    {
        public MyList GetEnumerator() => this;

        public int Current => 2;

        public bool MoveNext() => true;
    }

 

Believe it or not, this version compiles and runs!

 

 

Of course, the program will never stop because of this silly implementation (MoveNext always returns true), but that doesn’t matter. What’s important is that a type doesn’t have to implement the IEnumerable interface (generic or non-generic) to be used in a foreach statement. It’s worth mentioning that a very similar situation occurs when you want to implement a custom awaiter in C#. The awaitable type doesn’t have to implement any specific interface; it only needs an accessible GetAwaiter method. If the awaiter it returns has an IsCompleted property and a GetResult method (and implements INotifyCompletion, which is the one interface the compiler does insist on), then it’s a valid awaiter.
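To make the point a bit more tangible, here is a minimal sketch of a pattern-based collection that actually terminates. The Countdown type, its nested Enumerator and the PatternDemo class are names I made up for illustration; none of them implements IEnumerable or IEnumerator.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: no interfaces, only the members foreach looks for.
public class Countdown
{
    private readonly int _start;

    public Countdown(int start) => _start = start;

    // foreach only needs an accessible GetEnumerator method...
    public Enumerator GetEnumerator() => new Enumerator(_start);

    // ...whose return type exposes MoveNext() and Current.
    public sealed class Enumerator
    {
        private int _next;
        private int _current;

        public Enumerator(int start) => _next = start;

        public int Current => _current;

        public bool MoveNext()
        {
            if (_next <= 0) return false;   // unlike MyList, this one ends
            _current = _next--;
            return true;
        }
    }
}

public static class PatternDemo
{
    public static List<int> Collect(int start)
    {
        var result = new List<int>();
        foreach (var n in new Countdown(start))
        {
            result.Add(n);
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Collect(3)));   // prints "3, 2, 1"
    }
}
```

I used a separate enumerator class here instead of returning this, so that every foreach gets its own fresh iteration state.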

 

Foreach statement on IL level

All right, I believe we can now see how the foreach statement is compiled to IL because <big_surprise> … it’s nothing more than syntactic sugar. To find out, we need to visit Roslyn’s GitHub repository and open the LocalRewriter_ForEachStatement.cs file. The first method we can find there is called VisitForEachStatement, and it contains some very interesting code:

 


            if (nodeExpressionType.Kind == SymbolKind.ArrayType)
            {
                ArrayTypeSymbol arrayType = (ArrayTypeSymbol)nodeExpressionType;
                if (arrayType.IsSZArray)
                {
                    return RewriteSingleDimensionalArrayForEachStatement(node);
                }
                else
                {
                    return RewriteMultiDimensionalArrayForEachStatement(node);
                }
            }
            else if (CanRewriteForEachAsFor(node.Syntax, nodeExpressionType, out var indexerGet, out var lengthGetter))
            {
                return RewriteForEachStatementAsFor(node, indexerGet, lengthGetter);
            }
            else
            {
                return RewriteEnumeratorForEachStatement(node);
            }

 

It seems that foreach can be rewritten differently depending on the collection we want to iterate through. As you can see, there are four possible transformations:

  • for one dimensional array
  • for multi-dimensional array
  • for an enumerator (so, I believe, types that contain a GetEnumerator method but do not necessarily implement the IEnumerable interface)
  • for types that can be iterated with a plain for loop

Let’s see what each of them looks like (according to Roslyn’s GitHub).

 

One dimensional array

 

foreach (var element in new[] { 1,2,3 })
{
    //body
}

//becomes:

int[] a = new int[] { 1, 2, 3, };
for (int p = 0; p < a.Length; p = p + 1)
{
    int current = a[p];
    // body
}

 

Multi-dimensional array

 

foreach (var element in new[,] { { 1, 2 }, { 3, 4 } })
{
    //body
}

//becomes:

int[,] a = new int[2,2] { { 1, 2 }, { 3, 4 } };

int q_0 = a.GetUpperBound(0), q_1 = a.GetUpperBound(1);

for (int p_0 = a.GetLowerBound(0); p_0 <= q_0; p_0 = p_0 + 1) 
{
    for (int p_1 = a.GetLowerBound(1); p_1 <= q_1; p_1 = p_1 + 1)
    {
        int current = a[p_0, p_1];
        // body
    }
}
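The GetLowerBound calls may look redundant, but multi-dimensional arrays in .NET don’t have to start at index 0. Below is a small sketch (the MultiDimDemo class and its method names are my own, purely for illustration) that creates a 2x2 array indexed from 1 and checks that foreach and the hand-written loops visit the same elements in the same order.

```csharp
using System;
using System.Collections.Generic;

public static class MultiDimDemo
{
    // Builds a 2x2 array whose indices run from 1 to 2 in both dimensions.
    public static int[,] CreateOneBased()
    {
        var a = (int[,])Array.CreateInstance(typeof(int), new[] { 2, 2 }, new[] { 1, 1 });
        a[1, 1] = 1; a[1, 2] = 2;
        a[2, 1] = 3; a[2, 2] = 4;
        return a;
    }

    public static List<int> ViaForeach(int[,] a)
    {
        var result = new List<int>();
        foreach (int element in a)
        {
            result.Add(element);
        }
        return result;
    }

    // Hand-written equivalent of the transformation shown above.
    public static List<int> ViaLoweredLoops(int[,] a)
    {
        var result = new List<int>();
        int q0 = a.GetUpperBound(0), q1 = a.GetUpperBound(1);
        for (int p0 = a.GetLowerBound(0); p0 <= q0; p0 = p0 + 1)
        {
            for (int p1 = a.GetLowerBound(1); p1 <= q1; p1 = p1 + 1)
            {
                result.Add(a[p0, p1]);
            }
        }
        return result;
    }

    public static void Main()
    {
        int[,] a = CreateOneBased();
        Console.WriteLine(string.Join(", ", ViaForeach(a)));        // prints "1, 2, 3, 4"
        Console.WriteLine(string.Join(", ", ViaLoweredLoops(a)));   // prints "1, 2, 3, 4"
    }
}
```

A loop that started at 0 and compared against GetLength would throw IndexOutOfRangeException on such an array, which I assume is why the rewriter relies on the bounds instead.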


 

Enumerator

 

foreach (var element in new List<int> { 1,2,3 })
{
    //body
}

//becomes:

List<int>.Enumerator e = new List<int> { 1, 2, 3 }.GetEnumerator();

try
{
    while(e.MoveNext())
    {
        int current = e.Current;
        //body
    }
}
finally
{
    e.Dispose();
}
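The try-finally part is easy to overlook, so here is a rough sketch that makes it visible. LoggingEnumerator, LoggingSequence and DisposeDemo are hypothetical helpers written only for this post; the enumerator records every call it receives, and the loop body throws halfway through.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical enumerator that yields 1 and 2 and logs each call it receives.
public sealed class LoggingEnumerator : IEnumerator<int>
{
    private readonly List<string> _log;
    private int _current;

    public LoggingEnumerator(List<string> log) => _log = log;

    public int Current => _current;
    object IEnumerator.Current => _current;

    public bool MoveNext()
    {
        _log.Add("MoveNext");
        return ++_current <= 2;
    }

    public void Reset() => throw new NotSupportedException();

    public void Dispose() => _log.Add("Dispose");
}

public sealed class LoggingSequence : IEnumerable<int>
{
    private readonly List<string> _log;

    public LoggingSequence(List<string> log) => _log = log;

    public IEnumerator<int> GetEnumerator() => new LoggingEnumerator(_log);
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class DisposeDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        try
        {
            foreach (int element in new LoggingSequence(log))
            {
                // Leave the loop abnormally on the second element.
                if (element == 2) throw new InvalidOperationException("boom");
            }
        }
        catch (InvalidOperationException)
        {
            log.Add("caught");
        }
        return log;
    }

    public static void Main()
    {
        // prints "MoveNext -> MoveNext -> Dispose -> caught"
        Console.WriteLine(string.Join(" -> ", Run()));
    }
}
```

Dispose shows up in the log before the catch block runs, which is what you’d expect if the generated finally block sits inside the foreach itself.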

 

Other types

Before presenting the actual transformation, it’s worth knowing which types are included in this group. They can be found inside the CanRewriteForEachAsFor method (still in the LocalRewriter_ForEachStatement.cs file). The types are:

  • String
  • Span<T>
  • ReadOnlySpan<T>

 


foreach (var element in "Text")
{
    //body
}

//becomes:

string a = "Text";

for (int p = 0; p < a.Length; p = p + 1)
{
    char current = a[p];
    //body
}
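As a quick sanity check, the sketch below runs the same body through foreach over a string, through the hand-written for loop, and through foreach over a ReadOnlySpan<char>. StringLoopDemo and its methods are names I invented for the example, and the span variant assumes a runtime where string.AsSpan() is available (.NET Core 2.1 or later).

```csharp
using System;
using System.Text;

public static class StringLoopDemo
{
    public static string ViaForeach(string text)
    {
        var sb = new StringBuilder();
        foreach (char element in text)
        {
            sb.Append(char.ToUpperInvariant(element));
        }
        return sb.ToString();
    }

    // Hand-written equivalent of the transformation shown above.
    public static string ViaLoweredFor(string text)
    {
        var sb = new StringBuilder();
        string a = text;
        for (int p = 0; p < a.Length; p = p + 1)
        {
            char current = a[p];
            sb.Append(char.ToUpperInvariant(current));
        }
        return sb.ToString();
    }

    // Same body, but iterating over a span of the string's characters.
    public static string ViaSpan(string text)
    {
        var sb = new StringBuilder();
        ReadOnlySpan<char> span = text.AsSpan();
        foreach (char element in span)
        {
            sb.Append(char.ToUpperInvariant(element));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ViaForeach("Text"));     // prints "TEXT"
        Console.WriteLine(ViaLoweredFor("Text"));  // prints "TEXT"
        Console.WriteLine(ViaSpan("Text"));        // prints "TEXT"
    }
}
```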


 

What is kind of disappointing is that I wasn’t able to see either the enumerator or the „other types” transformations in dotPeek. After decompiling the code I still saw the foreach statement, even though the IL code clearly contained try-finally and so forth. Below you can see that:

 

 

I don’t know whether it was my fault or not. If so, please let me know down in the comments what I should do to make it work as expected.

 

Scope of the foreach variable

All the above transformations have something in common: the „current” value is first assigned to a local variable. There are two quite interesting things related to that.

The first „fun fact” is that it wasn’t always that way. Before C# 5.0, the variable was declared outside the loop body, like so:

 


List<int>.Enumerator e = new List<int> { 1, 2, 3 }.GetEnumerator();

try
{
    int current;

    while (e.MoveNext())
    {
        current = e.Current;
        //body
    }
}
finally
{
    e.Dispose();
}


 

The difference doesn’t seem significant; however, it caused an issue whenever the loop variable was captured by a lambda that ran later, typically in a multi-threaded scenario. The most popular example is the following one:

 


foreach(var element in new[] { 1,2,3 })
{
    Task.Run(() => Console.WriteLine(element));
}

 

What’s the issue then? Well, in a nutshell, the output of the above program was unpredictable before C# 5.0: every lambda captured the same single variable, so what each task printed depended on how far the loop had advanced by the time its worker thread ran (often the last element, three times). To understand this topic it’s crucial to know what closures are in C# and how they capture local variables but… this is the subject of the next part of this series. Therefore, we’ll get back to this code in the future 😉 Getting back to the story, the problem could be easily fixed by assigning the „current value” to a local variable inside the loop. But there was one small problem… you had to know about it. Therefore, the design team decided to make a breaking change in C# 5.0 and fix it at the language level.
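Without going into closures yet, the difference can be reproduced deterministically, with no threads involved. The sketch below is my own emulation, not compiler output: SharedVariable imitates the old lowering by declaring the variable once outside the loop body, while FreshVariable uses a regular foreach as compiled by C# 5.0 or later. CaptureDemo and both method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CaptureDemo
{
    // Emulates the pre-C# 5.0 shape: one variable shared by all iterations.
    public static List<int> SharedVariable()
    {
        var actions = new List<Func<int>>();
        int[] a = { 1, 2, 3 };
        int current;                        // declared once, outside the loop body
        for (int p = 0; p < a.Length; p = p + 1)
        {
            current = a[p];
            actions.Add(() => current);     // every lambda captures the same variable
        }
        // The lambdas run only after the loop has finished.
        return actions.Select(f => f()).ToList();
    }

    // Regular foreach (C# 5.0+): each iteration gets its own variable.
    public static List<int> FreshVariable()
    {
        var actions = new List<Func<int>>();
        foreach (int element in new[] { 1, 2, 3 })
        {
            actions.Add(() => element);
        }
        return actions.Select(f => f()).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", SharedVariable()));  // prints "3, 3, 3"
        Console.WriteLine(string.Join(", ", FreshVariable()));   // prints "1, 2, 3"
    }
}
```

With Task.Run the lambdas may start at any moment during the loop, which is why the original symptom looked random rather than a clean „3, 3, 3”.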

The second thing related to the local scope of the foreach variable is also quite interesting. It’s read-only, so the code below does not even compile:

 

foreach(var element in new[] { 1,2,3 })
{
    element = 1;
}

 

Believe it or not, I found this out just a few days ago (which I believe is a good thing in this particular case). At first, this seems weird because there’s no obvious explanation for the restriction. If the body also printed element, I’d expect to see „1” three times. This seems quite logical, right?

Well, MSDN says the following:

„The foreach statement is used to iterate through the collection to get the information that you want, but can not be used to add or remove items from the source collection to avoid unpredictable side effects. If you need to add or remove items from the source collection, use a for loop.”

I think this situation is analogous to the one with race conditions. If you know exactly how the foreach statement is compiled to IL, then you won’t be surprised that the „current value” is just a local copy, so assigning a new value to it would not affect the actual element in the collection, no matter whether it’s a value or a reference type. (With a reference type you can still change the object’s members through that copy; you just can’t replace the element itself.) But once again there’s one small problem here… you need to know it. Otherwise, the result could be surprising. I believe this is the reason for this restriction.
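Here is a small sketch of what „just a local copy” means in practice. Since the compiler rejects the assignment inside foreach, ReassignLocalCopy uses the hand-written for-loop form instead, while MutateThroughReference shows what is still allowed with a reference type. The Box class, CopyDemo and both method names are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical reference type used only for this illustration.
public sealed class Box
{
    public int Value;
}

public static class CopyDemo
{
    // What the forbidden "element = 1" would amount to: overwriting a copy.
    public static int[] ReassignLocalCopy()
    {
        int[] a = { 1, 2, 3 };
        for (int p = 0; p < a.Length; p = p + 1)
        {
            int current = a[p];   // the copy that foreach hands to the body
            current = 1;          // changes the local only, never a[p]
        }
        return a;
    }

    // The variable is read-only, but the object it refers to is not.
    public static List<Box> MutateThroughReference()
    {
        var boxes = new List<Box> { new Box { Value = 1 }, new Box { Value = 2 } };
        foreach (Box element in boxes)
        {
            element.Value = 42;   // compiles: we change the object, not the variable
        }
        return boxes;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ReassignLocalCopy()));                           // prints "1, 2, 3"
        Console.WriteLine(string.Join(", ", MutateThroughReference().Select(b => b.Value))); // prints "42, 42"
    }
}
```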


  • Arek Bal

    Let me just mention (as a supplement to your fine article) that this test succeeds:

    var list0 = new List<int>(Enumerable.Range(0, 9999999));
    var list1 = new List<int>(Enumerable.Range(0, 9999999));

    IEnumerable<int> enumerable = list1;

    Stopwatch stopwatch = new Stopwatch();

    // Iterate over the List<int> directly.
    stopwatch.Start();

    int x = 0;

    foreach (var item in list0)
    {
        unchecked
        {
            x += item;
        }
    }

    stopwatch.Stop();

    var listElapsed = stopwatch.ElapsedTicks;

    stopwatch.Reset();

    // Iterate over the same kind of list through the IEnumerable<int> interface.
    stopwatch.Start();

    int y = 0;
    foreach (var item in enumerable)
    {
        unchecked
        {
            y += item;
        }
    }

    stopwatch.Stop();

    Assert.IsTrue(listElapsed < stopwatch.ElapsedTicks * 3L / 4L);
