Custom implementation of the linq Zip operator for different length lists
up vote
6
down vote
favorite
Based on my answer I have my implementation of linq Zip operator which operates on different length lists, and loops shortest list.
My implementation:
using System;
using System.Collections.Generic;
using System.Linq;
namespace SO
{
internal class Program
{
public static void Main(string args)
{
List<String> listA = new List<string> {"a", "b", "c", "d", "e", "f", "g"};
List<String> listB = new List<string> {"1", "2", "3"};
var mix = listA.ZipNew(listB, (l1, l2) => new {l1, l2}).SelectMany(x => x);
foreach (var m in mix)
{
Console.WriteLine(m);
}
}
}
public static class Impl
{
public static IEnumerable<TResult> ZipNew<TFirst, TSecond, TResult>(
this IEnumerable<TFirst> first,
IEnumerable<TSecond> second,
Func<TFirst, TSecond, TResult> resultSelector)
{
using (IEnumerator<TFirst> iterator1 = first.GetEnumerator())
using (IEnumerator<TSecond> iterator2 = second.GetEnumerator())
{
var i1 = true;
var i2 = true;
var i1Shorter = false;
var i2Shorter = false;
var firstRun = true;
while(true)
{
i1 = iterator1.MoveNext();
i2 = iterator2.MoveNext();
if (!i1 && (i1Shorter || firstRun))
{
iterator1.Reset();
i1 = iterator1.MoveNext();
i1Shorter = true;
firstRun = false;
}
if (!i2 && (i2Shorter || firstRun))
{
iterator2.Reset();
i2 = iterator2.MoveNext();
i2Shorter = true;
firstRun = false;
}
if (!(i1 && i2))
{
break;
}
yield return resultSelector(iterator1.Current, iterator2.Current);
}
}
}
}
}
And I wonder if this implementation could be improved somehow, what could be improved, for better readability or speed.
c# linq
add a comment |
up vote
6
down vote
favorite
Based on my answer I have my implementation of linq Zip operator which operates on different length lists, and loops shortest list.
My implementation:
using System;
using System.Collections.Generic;
using System.Linq;
namespace SO
{
internal class Program
{
public static void Main(string args)
{
List<String> listA = new List<string> {"a", "b", "c", "d", "e", "f", "g"};
List<String> listB = new List<string> {"1", "2", "3"};
var mix = listA.ZipNew(listB, (l1, l2) => new {l1, l2}).SelectMany(x => x);
foreach (var m in mix)
{
Console.WriteLine(m);
}
}
}
public static class Impl
{
public static IEnumerable<TResult> ZipNew<TFirst, TSecond, TResult>(
this IEnumerable<TFirst> first,
IEnumerable<TSecond> second,
Func<TFirst, TSecond, TResult> resultSelector)
{
using (IEnumerator<TFirst> iterator1 = first.GetEnumerator())
using (IEnumerator<TSecond> iterator2 = second.GetEnumerator())
{
var i1 = true;
var i2 = true;
var i1Shorter = false;
var i2Shorter = false;
var firstRun = true;
while(true)
{
i1 = iterator1.MoveNext();
i2 = iterator2.MoveNext();
if (!i1 && (i1Shorter || firstRun))
{
iterator1.Reset();
i1 = iterator1.MoveNext();
i1Shorter = true;
firstRun = false;
}
if (!i2 && (i2Shorter || firstRun))
{
iterator2.Reset();
i2 = iterator2.MoveNext();
i2Shorter = true;
firstRun = false;
}
if (!(i1 && i2))
{
break;
}
yield return resultSelector(iterator1.Current, iterator2.Current);
}
}
}
}
}
And I wonder if this implementation could be improved somehow, what could be improved, for better readability or speed.
c# linq
2
Your question would benefit from some example usage within the question itself, so that we can see how you intend the method to be consumed, and quickly verify that it is working as intended.
– VisualMelon
16 hours ago
@VisualMelon improved, full working code
– BWA
16 hours ago
1
NeverReset
an enumerator. Just get a new enumerator.Reset
was a misfeature intended for COM interop scenarios.
– Eric Lippert
11 hours ago
add a comment |
up vote
6
down vote
favorite
up vote
6
down vote
favorite
Based on my answer I have my implementation of linq Zip operator which operates on different length lists, and loops shortest list.
My implementation:
using System;
using System.Collections.Generic;
using System.Linq;
namespace SO
{
internal class Program
{
public static void Main(string args)
{
List<String> listA = new List<string> {"a", "b", "c", "d", "e", "f", "g"};
List<String> listB = new List<string> {"1", "2", "3"};
var mix = listA.ZipNew(listB, (l1, l2) => new {l1, l2}).SelectMany(x => x);
foreach (var m in mix)
{
Console.WriteLine(m);
}
}
}
public static class Impl
{
public static IEnumerable<TResult> ZipNew<TFirst, TSecond, TResult>(
this IEnumerable<TFirst> first,
IEnumerable<TSecond> second,
Func<TFirst, TSecond, TResult> resultSelector)
{
using (IEnumerator<TFirst> iterator1 = first.GetEnumerator())
using (IEnumerator<TSecond> iterator2 = second.GetEnumerator())
{
var i1 = true;
var i2 = true;
var i1Shorter = false;
var i2Shorter = false;
var firstRun = true;
while(true)
{
i1 = iterator1.MoveNext();
i2 = iterator2.MoveNext();
if (!i1 && (i1Shorter || firstRun))
{
iterator1.Reset();
i1 = iterator1.MoveNext();
i1Shorter = true;
firstRun = false;
}
if (!i2 && (i2Shorter || firstRun))
{
iterator2.Reset();
i2 = iterator2.MoveNext();
i2Shorter = true;
firstRun = false;
}
if (!(i1 && i2))
{
break;
}
yield return resultSelector(iterator1.Current, iterator2.Current);
}
}
}
}
}
And I wonder if this implementation could be improved somehow, what could be improved, for better readability or speed.
c# linq
Based on my answer I have my implementation of linq Zip operator which operates on different length lists, and loops shortest list.
My implementation:
using System;
using System.Collections.Generic;
using System.Linq;
namespace SO
{
internal class Program
{
public static void Main(string args)
{
List<String> listA = new List<string> {"a", "b", "c", "d", "e", "f", "g"};
List<String> listB = new List<string> {"1", "2", "3"};
var mix = listA.ZipNew(listB, (l1, l2) => new {l1, l2}).SelectMany(x => x);
foreach (var m in mix)
{
Console.WriteLine(m);
}
}
}
public static class Impl
{
public static IEnumerable<TResult> ZipNew<TFirst, TSecond, TResult>(
this IEnumerable<TFirst> first,
IEnumerable<TSecond> second,
Func<TFirst, TSecond, TResult> resultSelector)
{
using (IEnumerator<TFirst> iterator1 = first.GetEnumerator())
using (IEnumerator<TSecond> iterator2 = second.GetEnumerator())
{
var i1 = true;
var i2 = true;
var i1Shorter = false;
var i2Shorter = false;
var firstRun = true;
while(true)
{
i1 = iterator1.MoveNext();
i2 = iterator2.MoveNext();
if (!i1 && (i1Shorter || firstRun))
{
iterator1.Reset();
i1 = iterator1.MoveNext();
i1Shorter = true;
firstRun = false;
}
if (!i2 && (i2Shorter || firstRun))
{
iterator2.Reset();
i2 = iterator2.MoveNext();
i2Shorter = true;
firstRun = false;
}
if (!(i1 && i2))
{
break;
}
yield return resultSelector(iterator1.Current, iterator2.Current);
}
}
}
}
}
And I wonder if this implementation could be improved somehow, what could be improved, for better readability or speed.
c# linq
c# linq
edited 16 hours ago
asked 16 hours ago
BWA
16627
16627
2
Your question would benefit from some example usage within the question itself, so that we can see how you intend the method to be consumed, and quickly verify that it is working as intended.
– VisualMelon
16 hours ago
@VisualMelon improved, full working code
– BWA
16 hours ago
1
NeverReset
an enumerator. Just get a new enumerator.Reset
was a misfeature intended for COM interop scenarios.
– Eric Lippert
11 hours ago
add a comment |
2
Your question would benefit from some example usage within the question itself, so that we can see how you intend the method to be consumed, and quickly verify that it is working as intended.
– VisualMelon
16 hours ago
@VisualMelon improved, full working code
– BWA
16 hours ago
1
NeverReset
an enumerator. Just get a new enumerator.Reset
was a misfeature intended for COM interop scenarios.
– Eric Lippert
11 hours ago
2
2
Your question would benefit from some example usage within the question itself, so that we can see how you intend the method to be consumed, and quickly verify that it is working as intended.
– VisualMelon
16 hours ago
Your question would benefit from some example usage within the question itself, so that we can see how you intend the method to be consumed, and quickly verify that it is working as intended.
– VisualMelon
16 hours ago
@VisualMelon improved, full working code
– BWA
16 hours ago
@VisualMelon improved, full working code
– BWA
16 hours ago
1
1
Never
Reset
an enumerator. Just get a new enumerator. Reset
was a misfeature intended for COM interop scenarios.– Eric Lippert
11 hours ago
Never
Reset
an enumerator. Just get a new enumerator. Reset
was a misfeature intended for COM interop scenarios.– Eric Lippert
11 hours ago
add a comment |
3 Answers
3
active
oldest
votes
up vote
6
down vote
public static class Impl
{
public static IEnumerable<TResult> ZipNew<TFirst, TSecond, TResult>(
Names? The class would be more descriptive as something like LinqExtensions
; the method something like ZipLooped
.
using (IEnumerator<TFirst> iterator1 = first.GetEnumerator())
using (IEnumerator<TSecond> iterator2 = second.GetEnumerator())
{
var i1 = true;
var i2 = true;
var i1Shorter = false;
var i2Shorter = false;
var firstRun = true;
The iterators have useful names, but what does i1
mean? And why five variables to track the state of two iterators? IMO it would be simpler as
var firstEnded = false;
var secondEnded = false;
while (true)
{
if (!iterator1.MoveNext())
{
if (secondEnded) yield break;
firstEnded = true;
iterator1.Reset();
if (!iterator1.MoveNext()) yield break;
}
if (!iterator2.MoveNext())
{
if (firstEnded) yield break;
secondEnded = true;
iterator2.Reset();
if (!iterator2.MoveNext()) yield break;
}
yield return resultSelector(iterator1.Current, iterator2.Current);
}
and the almost repeated code might be worth pulling out as an inner method:
var firstEnded = false;
var secondEnded = false;
bool advance<T>(IEnumerator<T> it, ref bool thisEnded, bool otherEnded)
{
if (it.MoveNext()) return true;
// `it` has done a full cycle; if the other one has too, we've finished
if (otherEnded) return false;
thisEnded = true;
// Start again, although if `it` is empty we need to abort
it.Reset();
return it.MoveNext();
}
while (true)
{
if (!advance(iterator1, ref firstEnded, secondEnded)) yield break;
if (!advance(iterator2, ref secondEnded, firstEnded)) yield break;
yield return resultSelector(iterator1.Current, iterator2.Current);
}
I notice that you've decided to yield break
if either of the enumerables is empty. Would an exception be a better choice?
I'd renameadvance
inTryMoveNextOrLoop
– t3chb0t
14 hours ago
I notice that you've decided to yield break if either of the enumerables is empty. - This is the expected behaviour. I'd be surprised if it was something else and this makes linq so reliable - nothing there so nothing happens - otherwise you would need to check everything for emptyness, not pretty ;-)
– t3chb0t
14 hours ago
@t3chb0t only most LINQ methods don't explicitly loop over one of the inputs as it consumes the other. I would probably expect an exception here if only one of them was empty, and an empty enumerable (yield break
) if both are empty.
– VisualMelon
13 hours ago
@VisualMelon no way :-P this is not how it should work. No collection returning LINQ extensions throw exceptions if the source is empty. Compare thisnew { 1 }.Zip(Enumerable.Empty<int>(), (x, y) => (x, y)).Dump();
The result is an empty collection.
– t3chb0t
13 hours ago
@t3chb0t indeed, but that's because it's defined as zipping as far as the shortest. I'd argue that extracting values repeatedly from an empty collection is meaningless, and so it should throw (as perAverage
). Anyhow, I can see your argument, so I think we'll have to agree to disagree :P
– VisualMelon
13 hours ago
|
show 7 more comments
up vote
6
down vote
I noticed a few things that can be improved:
- Not all enumerators support
Reset
. Generator methods don't, for example, so callingZipNew
on the result of aZipNew
call will fail with aNotSupportedException
. Obtaining a new enumerator should work, at the cost of having to replace the convenientusing
statements withtry/finally
constructions. - There's no need to call
Reset
when a collection is empty. I'd probably add a special case for that. - Passing
null
causes either an unspecificNullReferenceException
or anArgumentNullException
with parameter namesource
to be thrown. ThrowingArgumentNullException
s with accurate parameter names would be more helpful.
i1
andi2
can be declared inside the while loop.
add a comment |
up vote
2
down vote
If you should respect that not all Enumerators
implement Reset()
then it is not possible to use using
statements for the two IEnumerators
. But you could introduce an IEnumerator<TResult>
for the zipped result and use it like this:
public static IEnumerable<TResult> ZipNew<TFirst, TSecond, TResult>(
this IEnumerable<TFirst> first,
IEnumerable<TSecond> second,
Func<TFirst, TSecond, TResult> resultSelector)
{
using (ZipEnumerator<TFirst, TSecond, TResult> zipEnumerator = new ZipEnumerator<TFirst, TSecond, TResult>(first, second, resultSelector))
{
while (zipEnumerator.MoveNext())
{
yield return zipEnumerator.Current;
}
}
}
In this way you're back on the using track.
The ZipEnumerator
it self could be something like:
public class ZipEnumerator<T, S, TResult> : IEnumerator<TResult>
{
IEnumerable<T> m_dataT;
IEnumerable<S> m_dataS;
IEnumerator<T> m_enumerT;
IEnumerator<S> m_enumerS;
List<IDisposable> m_disposables = new List<IDisposable>();
Func<T, S, TResult> m_selector;
bool m_secondReloaded = false;
bool m_first = true;
public ZipEnumerator(IEnumerable<T> dataT, IEnumerable<S> dataS, Func<T, S, TResult> selector)
{
m_dataT = dataT ?? throw new ArgumentNullException(nameof(dataT));
m_dataS = dataS ?? throw new ArgumentNullException(nameof(dataS));
m_selector = selector ?? throw new ArgumentNullException(nameof(selector));
}
public TResult Current => m_selector(m_enumerT.Current, m_enumerS.Current);
object IEnumerator.Current => Current;
public void Dispose()
{
foreach (IDisposable disposable in m_disposables)
{
disposable.Dispose();
}
m_disposables.Clear();
}
private IEnumerator<T> GetTEnumerator()
{
var enumerator = m_dataT.GetEnumerator();
m_disposables.Add(enumerator);
return enumerator;
}
private IEnumerator<S> GetSEnumerator()
{
var enumerator = m_dataS.GetEnumerator();
m_disposables.Add(enumerator);
return enumerator;
}
public bool MoveNext()
{
m_enumerT = m_enumerT ?? GetTEnumerator();
m_enumerS = m_enumerS ?? GetSEnumerator();
if (m_first)
{
if (m_enumerT.MoveNext())
{
if (!m_enumerS.MoveNext())
{
m_enumerS = GetSEnumerator();
m_secondReloaded = true;
if (!m_enumerS.MoveNext())
return false;
}
return true;
}
else
{
m_first = false;
}
}
if (!m_first && !m_secondReloaded)
{
if (m_enumerS.MoveNext())
{
if (!m_enumerT.MoveNext())
{
m_enumerT = GetTEnumerator();
if (!m_enumerT.MoveNext())
return false;
}
return true;
}
}
return false;
}
public void Reset()
{
m_secondReloaded = false;
m_first = true;
m_enumerT = null;
m_enumerS = null;
Dispose();
}
}
It's a little more code than other suggestions, but it encapsulates the problems with the disposal of intermediate enumerators without the necessity of a try-catch-statement. You could discuss if the disposal should be immediately when the enumerator is done or as I do collect them for disposal when the ZipEnumerator
itself is disposed off?
The MoveNext()
method went a little more complicated than I like, so feel free to edit or suggest improvements.
Edit
A refactored version of ZipEnumerator
:
public class ZipEnumerator<T, S, TResult> : IEnumerator<TResult>
{
IEnumerable<T> m_dataT;
IEnumerable<S> m_dataS;
IEnumerator<T> m_enumeratorT;
IEnumerator<S> m_enumeratorS;
List<IDisposable> m_disposables = new List<IDisposable>();
Func<T, S, TResult> m_selector;
bool m_secondReloaded = false;
bool m_isInitilized = false;
public ZipEnumerator(IEnumerable<T> dataT, IEnumerable<S> dataS, Func<T, S, TResult> selector)
{
m_dataT = dataT ?? throw new ArgumentNullException(nameof(dataT));
m_dataS = dataS ?? throw new ArgumentNullException(nameof(dataS));
m_selector = selector ?? throw new ArgumentNullException(nameof(selector));
}
public TResult Current => m_selector(m_enumeratorT.Current, m_enumeratorS.Current);
object IEnumerator.Current => Current;
public void Dispose()
{
DoDispose();
}
private void RegisterDisposable(IDisposable disposable)
{
m_disposables.Add(disposable);
if (m_disposables.Count > 10)
{
DoDispose();
}
}
private void DoDispose()
{
foreach (IDisposable disposable in m_disposables)
{
disposable.Dispose();
}
m_disposables.Clear();
}
private IEnumerator<T> GetTEnumerator()
{
var enumerator = m_dataT.GetEnumerator();
RegisterDisposable(enumerator);
return enumerator;
}
private IEnumerator<S> GetSEnumerator()
{
var enumerator = m_dataS.GetEnumerator();
RegisterDisposable(enumerator);
return enumerator;
}
private Func<bool> CurrentMover = null;
private bool FirstMover()
{
if (m_enumeratorT.MoveNext())
{
if (!m_enumeratorS.MoveNext())
{
m_enumeratorS = GetSEnumerator();
m_secondReloaded = true;
if (!m_enumeratorS.MoveNext())
return false;
}
return true;
}
else if (!m_secondReloaded)
{
CurrentMover = SecondMover;
return CurrentMover();
}
return false;
}
private bool SecondMover()
{
if (m_enumeratorS.MoveNext())
{
if (!m_enumeratorT.MoveNext())
{
m_enumeratorT = GetTEnumerator();
if (!m_enumeratorT.MoveNext())
return false;
}
return true;
}
return false;
}
private void Initialize()
{
m_enumeratorT = GetTEnumerator();
m_enumeratorS = GetSEnumerator();
CurrentMover = FirstMover;
m_isInitilized = true;
}
public bool MoveNext()
{
if (!m_isInitilized)
{
Initialize();
}
return CurrentMover();
}
public void Reset()
{
m_isInitilized = false;
m_secondReloaded = false;
CurrentMover = null;
m_enumeratorT = null;
m_enumeratorS = null;
DoDispose();
}
}
I'd like to click +1 but when I see the variable names it says -1 so at the and it's a 0 ;-)
– t3chb0t
11 hours ago
I think the word unconventional describes them pretty good... althoughts
andss
were below that level ;-]
– t3chb0t
11 hours ago
I can explain that ;-P It's local and in a very small scope, this is allowed in my world,tt
andss
on the other hand werepublic
.
– t3chb0t
11 hours ago
add a comment |
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
6
down vote
public static class Impl
{
public static IEnumerable<TResult> ZipNew<TFirst, TSecond, TResult>(
Names? The class would be more descriptive as something like LinqExtensions
; the method something like ZipLooped
.
using (IEnumerator<TFirst> iterator1 = first.GetEnumerator())
using (IEnumerator<TSecond> iterator2 = second.GetEnumerator())
{
var i1 = true;
var i2 = true;
var i1Shorter = false;
var i2Shorter = false;
var firstRun = true;
The iterators have useful names, but what does i1
mean? And why five variables to track the state of two iterators? IMO it would be simpler as
var firstEnded = false;
var secondEnded = false;
while (true)
{
if (!iterator1.MoveNext())
{
if (secondEnded) yield break;
firstEnded = true;
iterator1.Reset();
if (!iterator1.MoveNext()) yield break;
}
if (!iterator2.MoveNext())
{
if (firstEnded) yield break;
secondEnded = true;
iterator2.Reset();
if (!iterator2.MoveNext()) yield break;
}
yield return resultSelector(iterator1.Current, iterator2.Current);
}
and the almost repeated code might be worth pulling out as an inner method:
var firstEnded = false;
var secondEnded = false;
bool advance<T>(IEnumerator<T> it, ref bool thisEnded, bool otherEnded)
{
if (it.MoveNext()) return true;
// `it` has done a full cycle; if the other one has too, we've finished
if (otherEnded) return false;
thisEnded = true;
// Start again, although if `it` is empty we need to abort
it.Reset();
return it.MoveNext();
}
while (true)
{
if (!advance(iterator1, ref firstEnded, secondEnded)) yield break;
if (!advance(iterator2, ref secondEnded, firstEnded)) yield break;
yield return resultSelector(iterator1.Current, iterator2.Current);
}
I notice that you've decided to yield break
if either of the enumerables is empty. Would an exception be a better choice?
I'd renameadvance
inTryMoveNextOrLoop
– t3chb0t
14 hours ago
I notice that you've decided to yield break if either of the enumerables is empty. - This is the expected behaviour. I'd be surprised if it was something else and this makes linq so reliable - nothing there so nothing happens - otherwise you would need to check everything for emptyness, not pretty ;-)
– t3chb0t
14 hours ago
@t3chb0t only most LINQ methods don't explicitly loop over one of the inputs as it consumes the other. I would probably expect an exception here if only one of them was empty, and an empty enumerable (yield break
) if both are empty.
– VisualMelon
13 hours ago
@VisualMelon no way :-P this is not how it should work. No collection returning LINQ extensions throw exceptions if the source is empty. Compare thisnew { 1 }.Zip(Enumerable.Empty<int>(), (x, y) => (x, y)).Dump();
The result is an empty collection.
– t3chb0t
13 hours ago
@t3chb0t indeed, but that's because it's defined as zipping as far as the shortest. I'd argue that extracting values repeatedly from an empty collection is meaningless, and so it should throw (as perAverage
). Anyhow, I can see your argument, so I think we'll have to agree to disagree :P
– VisualMelon
13 hours ago
|
show 7 more comments
up vote
6
down vote
public static class Impl
{
public static IEnumerable<TResult> ZipNew<TFirst, TSecond, TResult>(
Names? The class would be more descriptive as something like LinqExtensions
; the method something like ZipLooped
.
using (IEnumerator<TFirst> iterator1 = first.GetEnumerator())
using (IEnumerator<TSecond> iterator2 = second.GetEnumerator())
{
var i1 = true;
var i2 = true;
var i1Shorter = false;
var i2Shorter = false;
var firstRun = true;
The iterators have useful names, but what does i1
mean? And why five variables to track the state of two iterators? IMO it would be simpler as
var firstEnded = false;
var secondEnded = false;
while (true)
{
if (!iterator1.MoveNext())
{
if (secondEnded) yield break;
firstEnded = true;
iterator1.Reset();
if (!iterator1.MoveNext()) yield break;
}
if (!iterator2.MoveNext())
{
if (firstEnded) yield break;
secondEnded = true;
iterator2.Reset();
if (!iterator2.MoveNext()) yield break;
}
yield return resultSelector(iterator1.Current, iterator2.Current);
}
and the almost repeated code might be worth pulling out as an inner method:
var firstEnded = false;
var secondEnded = false;
bool advance<T>(IEnumerator<T> it, ref bool thisEnded, bool otherEnded)
{
if (it.MoveNext()) return true;
// `it` has done a full cycle; if the other one has too, we've finished
if (otherEnded) return false;
thisEnded = true;
// Start again, although if `it` is empty we need to abort
it.Reset();
return it.MoveNext();
}
while (true)
{
if (!advance(iterator1, ref firstEnded, secondEnded)) yield break;
if (!advance(iterator2, ref secondEnded, firstEnded)) yield break;
yield return resultSelector(iterator1.Current, iterator2.Current);
}
I notice that you've decided to yield break
if either of the enumerables is empty. Would an exception be a better choice?
I'd renameadvance
inTryMoveNextOrLoop
– t3chb0t
14 hours ago
I notice that you've decided to yield break if either of the enumerables is empty. - This is the expected behaviour. I'd be surprised if it was something else and this makes linq so reliable - nothing there so nothing happens - otherwise you would need to check everything for emptyness, not pretty ;-)
– t3chb0t
14 hours ago
@t3chb0t only most LINQ methods don't explicitly loop over one of the inputs as it consumes the other. I would probably expect an exception here if only one of them was empty, and an empty enumerable (yield break
) if both are empty.
– VisualMelon
13 hours ago
@VisualMelon no way :-P this is not how it should work. No collection returning LINQ extensions throw exceptions if the source is empty. Compare thisnew { 1 }.Zip(Enumerable.Empty<int>(), (x, y) => (x, y)).Dump();
The result is an empty collection.
– t3chb0t
13 hours ago
@t3chb0t indeed, but that's because it's defined as zipping as far as the shortest. I'd argue that extracting values repeatedly from an empty collection is meaningless, and so it should throw (as perAverage
). Anyhow, I can see your argument, so I think we'll have to agree to disagree :P
– VisualMelon
13 hours ago
|
show 7 more comments
up vote
6
down vote
up vote
6
down vote
public static class Impl
{
public static IEnumerable<TResult> ZipNew<TFirst, TSecond, TResult>(
Names? The class would be more descriptive as something like LinqExtensions
; the method something like ZipLooped
.
using (IEnumerator<TFirst> iterator1 = first.GetEnumerator())
using (IEnumerator<TSecond> iterator2 = second.GetEnumerator())
{
var i1 = true;
var i2 = true;
var i1Shorter = false;
var i2Shorter = false;
var firstRun = true;
The iterators have useful names, but what does i1
mean? And why five variables to track the state of two iterators? IMO it would be simpler as
var firstEnded = false;
var secondEnded = false;
while (true)
{
if (!iterator1.MoveNext())
{
if (secondEnded) yield break;
firstEnded = true;
iterator1.Reset();
if (!iterator1.MoveNext()) yield break;
}
if (!iterator2.MoveNext())
{
if (firstEnded) yield break;
secondEnded = true;
iterator2.Reset();
if (!iterator2.MoveNext()) yield break;
}
yield return resultSelector(iterator1.Current, iterator2.Current);
}
and the almost repeated code might be worth pulling out as an inner method:
var firstEnded = false;
var secondEnded = false;
bool advance<T>(IEnumerator<T> it, ref bool thisEnded, bool otherEnded)
{
if (it.MoveNext()) return true;
// `it` has done a full cycle; if the other one has too, we've finished
if (otherEnded) return false;
thisEnded = true;
// Start again, although if `it` is empty we need to abort
it.Reset();
return it.MoveNext();
}
while (true)
{
if (!advance(iterator1, ref firstEnded, secondEnded)) yield break;
if (!advance(iterator2, ref secondEnded, firstEnded)) yield break;
yield return resultSelector(iterator1.Current, iterator2.Current);
}
I notice that you've decided to yield break
if either of the enumerables is empty. Would an exception be a better choice?
public static class Impl
{
public static IEnumerable<TResult> ZipNew<TFirst, TSecond, TResult>(
Names? The class would be more descriptive as something like LinqExtensions
; the method something like ZipLooped
.
using (IEnumerator<TFirst> iterator1 = first.GetEnumerator())
using (IEnumerator<TSecond> iterator2 = second.GetEnumerator())
{
var i1 = true;
var i2 = true;
var i1Shorter = false;
var i2Shorter = false;
var firstRun = true;
The iterators have useful names, but what does i1
mean? And why five variables to track the state of two iterators? IMO it would be simpler as
var firstEnded = false;
var secondEnded = false;
while (true)
{
if (!iterator1.MoveNext())
{
if (secondEnded) yield break;
firstEnded = true;
iterator1.Reset();
if (!iterator1.MoveNext()) yield break;
}
if (!iterator2.MoveNext())
{
if (firstEnded) yield break;
secondEnded = true;
iterator2.Reset();
if (!iterator2.MoveNext()) yield break;
}
yield return resultSelector(iterator1.Current, iterator2.Current);
}
and the almost repeated code might be worth pulling out as an inner method:
var firstEnded = false;
var secondEnded = false;
bool advance<T>(IEnumerator<T> it, ref bool thisEnded, bool otherEnded)
{
if (it.MoveNext()) return true;
// `it` has done a full cycle; if the other one has too, we've finished
if (otherEnded) return false;
thisEnded = true;
// Start again, although if `it` is empty we need to abort
it.Reset();
return it.MoveNext();
}
while (true)
{
if (!advance(iterator1, ref firstEnded, secondEnded)) yield break;
if (!advance(iterator2, ref secondEnded, firstEnded)) yield break;
yield return resultSelector(iterator1.Current, iterator2.Current);
}
I notice that you've decided to yield break
if either of the enumerables is empty. Would an exception be a better choice?
answered 16 hours ago
Peter Taylor
15.3k2657
15.3k2657
I'd renameadvance
inTryMoveNextOrLoop
– t3chb0t
14 hours ago
I notice that you've decided to yield break if either of the enumerables is empty. - This is the expected behaviour. I'd be surprised if it was something else and this makes linq so reliable - nothing there so nothing happens - otherwise you would need to check everything for emptyness, not pretty ;-)
– t3chb0t
14 hours ago
@t3chb0t only most LINQ methods don't explicitly loop over one of the inputs as it consumes the other. I would probably expect an exception here if only one of them was empty, and an empty enumerable (yield break
) if both are empty.
– VisualMelon
13 hours ago
@VisualMelon no way :-P this is not how it should work. No collection returning LINQ extensions throw exceptions if the source is empty. Compare thisnew { 1 }.Zip(Enumerable.Empty<int>(), (x, y) => (x, y)).Dump();
The result is an empty collection.
– t3chb0t
13 hours ago
@t3chb0t indeed, but that's because it's defined as zipping as far as the shortest. I'd argue that extracting values repeatedly from an empty collection is meaningless, and so it should throw (as perAverage
). Anyhow, I can see your argument, so I think we'll have to agree to disagree :P
– VisualMelon
13 hours ago
|
show 7 more comments
I'd renameadvance
inTryMoveNextOrLoop
– t3chb0t
14 hours ago
I notice that you've decided to yield break if either of the enumerables is empty. - This is the expected behaviour. I'd be surprised if it was something else and this makes linq so reliable - nothing there so nothing happens - otherwise you would need to check everything for emptyness, not pretty ;-)
– t3chb0t
14 hours ago
@t3chb0t only most LINQ methods don't explicitly loop over one of the inputs as it consumes the other. I would probably expect an exception here if only one of them was empty, and an empty enumerable (yield break
) if both are empty.
– VisualMelon
13 hours ago
@VisualMelon no way :-P this is not how it should work. No collection returning LINQ extensions throw exceptions if the source is empty. Compare thisnew { 1 }.Zip(Enumerable.Empty<int>(), (x, y) => (x, y)).Dump();
The result is an empty collection.
– t3chb0t
13 hours ago
@t3chb0t indeed, but that's because it's defined as zipping as far as the shortest. I'd argue that extracting values repeatedly from an empty collection is meaningless, and so it should throw (as perAverage
). Anyhow, I can see your argument, so I think we'll have to agree to disagree :P
– VisualMelon
13 hours ago
I'd rename
advance
in TryMoveNextOrLoop
– t3chb0t
14 hours ago
I'd rename
advance
in TryMoveNextOrLoop
– t3chb0t
14 hours ago
I notice that you've decided to yield break if either of the enumerables is empty. - This is the expected behaviour. I'd be surprised if it was something else and this makes linq so reliable - nothing there so nothing happens - otherwise you would need to check everything for emptyness, not pretty ;-)
– t3chb0t
14 hours ago
I notice that you've decided to yield break if either of the enumerables is empty. - This is the expected behaviour. I'd be surprised if it was something else and this makes linq so reliable - nothing there so nothing happens - otherwise you would need to check everything for emptyness, not pretty ;-)
– t3chb0t
14 hours ago
@t3chb0t only most LINQ methods don't explicitly loop over one of the inputs as it consumes the other. I would probably expect an exception here if only one of them was empty, and an empty enumerable (
yield break
) if both are empty.– VisualMelon
13 hours ago
@t3chb0t only most LINQ methods don't explicitly loop over one of the inputs as it consumes the other. I would probably expect an exception here if only one of them was empty, and an empty enumerable (
yield break
) if both are empty.– VisualMelon
13 hours ago
@VisualMelon no way :-P this is not how it should work. No collection returning LINQ extensions throw exceptions if the source is empty. Compare this
new { 1 }.Zip(Enumerable.Empty<int>(), (x, y) => (x, y)).Dump();
The result is an empty collection.– t3chb0t
13 hours ago
@VisualMelon no way :-P this is not how it should work. No collection returning LINQ extensions throw exceptions if the source is empty. Compare this
new { 1 }.Zip(Enumerable.Empty<int>(), (x, y) => (x, y)).Dump();
The result is an empty collection.– t3chb0t
13 hours ago
@t3chb0t indeed, but that's because it's defined as zipping as far as the shortest. I'd argue that extracting values repeatedly from an empty collection is meaningless, and so it should throw (as per
Average
). Anyhow, I can see your argument, so I think we'll have to agree to disagree :P– VisualMelon
13 hours ago
@t3chb0t indeed, but that's because it's defined as zipping as far as the shortest. I'd argue that extracting values repeatedly from an empty collection is meaningless, and so it should throw (as per
Average
). Anyhow, I can see your argument, so I think we'll have to agree to disagree :P– VisualMelon
13 hours ago
|
show 7 more comments
up vote
6
down vote
I noticed a few things that can be improved:
- Not all enumerators support
Reset
. Generator methods don't, for example, so callingZipNew
on the result of aZipNew
call will fail with aNotSupportedException
. Obtaining a new enumerator should work, at the cost of having to replace the convenientusing
statements withtry/finally
constructions. - There's no need to call
Reset
when a collection is empty. I'd probably add a special case for that. - Passing
null
causes either an unspecificNullReferenceException
or anArgumentNullException
with parameter namesource
to be thrown. ThrowingArgumentNullException
s with accurate parameter names would be more helpful.
i1
andi2
can be declared inside the while loop.
add a comment |
up vote
6
down vote
I noticed a few things that can be improved:
- Not all enumerators support
Reset
. Generator methods don't, for example, so callingZipNew
on the result of aZipNew
call will fail with aNotSupportedException
. Obtaining a new enumerator should work, at the cost of having to replace the convenientusing
statements withtry/finally
constructions. - There's no need to call
Reset
when a collection is empty. I'd probably add a special case for that. - Passing
null
causes either an unspecificNullReferenceException
or anArgumentNullException
with parameter namesource
to be thrown. ThrowingArgumentNullException
s with accurate parameter names would be more helpful.
i1
andi2
can be declared inside the while loop.
add a comment |
up vote
6
down vote
up vote
6
down vote
I noticed a few things that can be improved:
- Not all enumerators support
Reset
. Generator methods don't, for example, so callingZipNew
on the result of aZipNew
call will fail with aNotSupportedException
. Obtaining a new enumerator should work, at the cost of having to replace the convenientusing
statements withtry/finally
constructions. - There's no need to call
Reset
when a collection is empty. I'd probably add a special case for that. - Passing
null
causes either an unspecificNullReferenceException
or anArgumentNullException
with parameter namesource
to be thrown. ThrowingArgumentNullException
s with accurate parameter names would be more helpful.
i1
andi2
can be declared inside the while loop.
I noticed a few things that can be improved:
- Not all enumerators support
Reset
. Generator methods don't, for example, so callingZipNew
on the result of aZipNew
call will fail with aNotSupportedException
. Obtaining a new enumerator should work, at the cost of having to replace the convenientusing
statements withtry/finally
constructions. - There's no need to call
Reset
when a collection is empty. I'd probably add a special case for that. - Passing
null
causes either an unspecificNullReferenceException
or anArgumentNullException
with parameter namesource
to be thrown. ThrowingArgumentNullException
s with accurate parameter names would be more helpful.
i1
andi2
can be declared inside the while loop.
answered 15 hours ago
Pieter Witvoet
4,846723
4,846723
add a comment |
add a comment |
up vote
2
down vote
If you should respect that not all Enumerators
implement Reset()
then it is not possible to use using
statements for the two IEnumerators
. But you could introduce an IEnumerator<TResult>
for the zipped result and use it like this:
public static IEnumerable<TResult> ZipNew<TFirst, TSecond, TResult>(
this IEnumerable<TFirst> first,
IEnumerable<TSecond> second,
Func<TFirst, TSecond, TResult> resultSelector)
{
using (ZipEnumerator<TFirst, TSecond, TResult> zipEnumerator = new ZipEnumerator<TFirst, TSecond, TResult>(first, second, resultSelector))
{
while (zipEnumerator.MoveNext())
{
yield return zipEnumerator.Current;
}
}
}
In this way you're back on the using track.
The ZipEnumerator
it self could be something like:
public class ZipEnumerator<T, S, TResult> : IEnumerator<TResult>
{
IEnumerable<T> m_dataT;
IEnumerable<S> m_dataS;
IEnumerator<T> m_enumerT;
IEnumerator<S> m_enumerS;
List<IDisposable> m_disposables = new List<IDisposable>();
Func<T, S, TResult> m_selector;
bool m_secondReloaded = false;
bool m_first = true;
public ZipEnumerator(IEnumerable<T> dataT, IEnumerable<S> dataS, Func<T, S, TResult> selector)
{
m_dataT = dataT ?? throw new ArgumentNullException(nameof(dataT));
m_dataS = dataS ?? throw new ArgumentNullException(nameof(dataS));
m_selector = selector ?? throw new ArgumentNullException(nameof(selector));
}
public TResult Current => m_selector(m_enumerT.Current, m_enumerS.Current);
object IEnumerator.Current => Current;
public void Dispose()
{
foreach (IDisposable disposable in m_disposables)
{
disposable.Dispose();
}
m_disposables.Clear();
}
private IEnumerator<T> GetTEnumerator()
{
var enumerator = m_dataT.GetEnumerator();
m_disposables.Add(enumerator);
return enumerator;
}
private IEnumerator<S> GetSEnumerator()
{
var enumerator = m_dataS.GetEnumerator();
m_disposables.Add(enumerator);
return enumerator;
}
public bool MoveNext()
{
m_enumerT = m_enumerT ?? GetTEnumerator();
m_enumerS = m_enumerS ?? GetSEnumerator();
if (m_first)
{
if (m_enumerT.MoveNext())
{
if (!m_enumerS.MoveNext())
{
m_enumerS = GetSEnumerator();
m_secondReloaded = true;
if (!m_enumerS.MoveNext())
return false;
}
return true;
}
else
{
m_first = false;
}
}
if (!m_first && !m_secondReloaded)
{
if (m_enumerS.MoveNext())
{
if (!m_enumerT.MoveNext())
{
m_enumerT = GetTEnumerator();
if (!m_enumerT.MoveNext())
return false;
}
return true;
}
}
return false;
}
public void Reset()
{
m_secondReloaded = false;
m_first = true;
m_enumerT = null;
m_enumerS = null;
Dispose();
}
}
It's a little more code than other suggestions, but it encapsulates the problems with the disposal of intermediate enumerators without the necessity of a try-catch-statement. You could discuss if the disposal should be immediately when the enumerator is done or as I do collect them for disposal when the ZipEnumerator
itself is disposed off?
The MoveNext()
method went a little more complicated than I like, so feel free to edit or suggest improvements.
Edit
A refactored version of ZipEnumerator
:
public class ZipEnumerator<T, S, TResult> : IEnumerator<TResult>
{
IEnumerable<T> m_dataT;
IEnumerable<S> m_dataS;
IEnumerator<T> m_enumeratorT;
IEnumerator<S> m_enumeratorS;
List<IDisposable> m_disposables = new List<IDisposable>();
Func<T, S, TResult> m_selector;
bool m_secondReloaded = false;
bool m_isInitilized = false;
public ZipEnumerator(IEnumerable<T> dataT, IEnumerable<S> dataS, Func<T, S, TResult> selector)
{
m_dataT = dataT ?? throw new ArgumentNullException(nameof(dataT));
m_dataS = dataS ?? throw new ArgumentNullException(nameof(dataS));
m_selector = selector ?? throw new ArgumentNullException(nameof(selector));
}
public TResult Current => m_selector(m_enumeratorT.Current, m_enumeratorS.Current);
object IEnumerator.Current => Current;
public void Dispose()
{
DoDispose();
}
private void RegisterDisposable(IDisposable disposable)
{
m_disposables.Add(disposable);
if (m_disposables.Count > 10)
{
DoDispose();
}
}
private void DoDispose()
{
foreach (IDisposable disposable in m_disposables)
{
disposable.Dispose();
}
m_disposables.Clear();
}
private IEnumerator<T> GetTEnumerator()
{
var enumerator = m_dataT.GetEnumerator();
RegisterDisposable(enumerator);
return enumerator;
}
private IEnumerator<S> GetSEnumerator()
{
var enumerator = m_dataS.GetEnumerator();
RegisterDisposable(enumerator);
return enumerator;
}
private Func<bool> CurrentMover = null;
private bool FirstMover()
{
if (m_enumeratorT.MoveNext())
{
if (!m_enumeratorS.MoveNext())
{
m_enumeratorS = GetSEnumerator();
m_secondReloaded = true;
if (!m_enumeratorS.MoveNext())
return false;
}
return true;
}
else if (!m_secondReloaded)
{
CurrentMover = SecondMover;
return CurrentMover();
}
return false;
}
private bool SecondMover()
{
if (m_enumeratorS.MoveNext())
{
if (!m_enumeratorT.MoveNext())
{
m_enumeratorT = GetTEnumerator();
if (!m_enumeratorT.MoveNext())
return false;
}
return true;
}
return false;
}
private void Initialize()
{
m_enumeratorT = GetTEnumerator();
m_enumeratorS = GetSEnumerator();
CurrentMover = FirstMover;
m_isInitilized = true;
}
public bool MoveNext()
{
if (!m_isInitilized)
{
Initialize();
}
return CurrentMover();
}
public void Reset()
{
m_isInitilized = false;
m_secondReloaded = false;
CurrentMover = null;
m_enumeratorT = null;
m_enumeratorS = null;
DoDispose();
}
}
I'd like to click +1 but when I see the variable names it says -1 so at the and it's a 0 ;-)
– t3chb0t
11 hours ago
I think the word unconventional describes them pretty good... althoughts
andss
were below that level ;-]
– t3chb0t
11 hours ago
I can explain that ;-P It's local and in a very small scope, this is allowed in my world,tt
andss
on the other hand werepublic
.
– t3chb0t
11 hours ago
add a comment |
up vote
2
down vote
If you should respect that not all Enumerators
implement Reset()
then it is not possible to use using
statements for the two IEnumerators
. But you could introduce an IEnumerator<TResult>
for the zipped result and use it like this:
public static IEnumerable<TResult> ZipNew<TFirst, TSecond, TResult>(
this IEnumerable<TFirst> first,
IEnumerable<TSecond> second,
Func<TFirst, TSecond, TResult> resultSelector)
{
using (ZipEnumerator<TFirst, TSecond, TResult> zipEnumerator = new ZipEnumerator<TFirst, TSecond, TResult>(first, second, resultSelector))
{
while (zipEnumerator.MoveNext())
{
yield return zipEnumerator.Current;
}
}
}
In this way you're back on the using track.
The ZipEnumerator
it self could be something like:
public class ZipEnumerator<T, S, TResult> : IEnumerator<TResult>
{
IEnumerable<T> m_dataT;
IEnumerable<S> m_dataS;
IEnumerator<T> m_enumerT;
IEnumerator<S> m_enumerS;
List<IDisposable> m_disposables = new List<IDisposable>();
Func<T, S, TResult> m_selector;
bool m_secondReloaded = false;
bool m_first = true;
public ZipEnumerator(IEnumerable<T> dataT, IEnumerable<S> dataS, Func<T, S, TResult> selector)
{
m_dataT = dataT ?? throw new ArgumentNullException(nameof(dataT));
m_dataS = dataS ?? throw new ArgumentNullException(nameof(dataS));
m_selector = selector ?? throw new ArgumentNullException(nameof(selector));
}
public TResult Current => m_selector(m_enumerT.Current, m_enumerS.Current);
object IEnumerator.Current => Current;
public void Dispose()
{
foreach (IDisposable disposable in m_disposables)
{
disposable.Dispose();
}
m_disposables.Clear();
}
private IEnumerator<T> GetTEnumerator()
{
var enumerator = m_dataT.GetEnumerator();
m_disposables.Add(enumerator);
return enumerator;
}
private IEnumerator<S> GetSEnumerator()
{
var enumerator = m_dataS.GetEnumerator();
m_disposables.Add(enumerator);
return enumerator;
}
public bool MoveNext()
{
m_enumerT = m_enumerT ?? GetTEnumerator();
m_enumerS = m_enumerS ?? GetSEnumerator();
if (m_first)
{
if (m_enumerT.MoveNext())
{
if (!m_enumerS.MoveNext())
{
m_enumerS = GetSEnumerator();
m_secondReloaded = true;
if (!m_enumerS.MoveNext())
return false;
}
return true;
}
else
{
m_first = false;
}
}
if (!m_first && !m_secondReloaded)
{
if (m_enumerS.MoveNext())
{
if (!m_enumerT.MoveNext())
{
m_enumerT = GetTEnumerator();
if (!m_enumerT.MoveNext())
return false;
}
return true;
}
}
return false;
}
public void Reset()
{
m_secondReloaded = false;
m_first = true;
m_enumerT = null;
m_enumerS = null;
Dispose();
}
}
It's a little more code than other suggestions, but it encapsulates the problems with the disposal of intermediate enumerators without the necessity of a try-catch-statement. You could discuss if the disposal should be immediately when the enumerator is done or as I do collect them for disposal when the ZipEnumerator
itself is disposed off?
The MoveNext()
method went a little more complicated than I like, so feel free to edit or suggest improvements.
Edit
A refactored version of ZipEnumerator
:
public class ZipEnumerator<T, S, TResult> : IEnumerator<TResult>
{
IEnumerable<T> m_dataT;
IEnumerable<S> m_dataS;
IEnumerator<T> m_enumeratorT;
IEnumerator<S> m_enumeratorS;
List<IDisposable> m_disposables = new List<IDisposable>();
Func<T, S, TResult> m_selector;
bool m_secondReloaded = false;
bool m_isInitilized = false;
public ZipEnumerator(IEnumerable<T> dataT, IEnumerable<S> dataS, Func<T, S, TResult> selector)
{
m_dataT = dataT ?? throw new ArgumentNullException(nameof(dataT));
m_dataS = dataS ?? throw new ArgumentNullException(nameof(dataS));
m_selector = selector ?? throw new ArgumentNullException(nameof(selector));
}
public TResult Current => m_selector(m_enumeratorT.Current, m_enumeratorS.Current);
object IEnumerator.Current => Current;
public void Dispose()
{
DoDispose();
}
private void RegisterDisposable(IDisposable disposable)
{
m_disposables.Add(disposable);
if (m_disposables.Count > 10)
{
DoDispose();
}
}
private void DoDispose()
{
foreach (IDisposable disposable in m_disposables)
{
disposable.Dispose();
}
m_disposables.Clear();
}
private IEnumerator<T> GetTEnumerator()
{
var enumerator = m_dataT.GetEnumerator();
RegisterDisposable(enumerator);
return enumerator;
}
private IEnumerator<S> GetSEnumerator()
{
var enumerator = m_dataS.GetEnumerator();
RegisterDisposable(enumerator);
return enumerator;
}
private Func<bool> CurrentMover = null;
private bool FirstMover()
{
if (m_enumeratorT.MoveNext())
{
if (!m_enumeratorS.MoveNext())
{
m_enumeratorS = GetSEnumerator();
m_secondReloaded = true;
if (!m_enumeratorS.MoveNext())
return false;
}
return true;
}
else if (!m_secondReloaded)
{
CurrentMover = SecondMover;
return CurrentMover();
}
return false;
}
private bool SecondMover()
{
if (m_enumeratorS.MoveNext())
{
if (!m_enumeratorT.MoveNext())
{
m_enumeratorT = GetTEnumerator();
if (!m_enumeratorT.MoveNext())
return false;
}
return true;
}
return false;
}
private void Initialize()
{
m_enumeratorT = GetTEnumerator();
m_enumeratorS = GetSEnumerator();
CurrentMover = FirstMover;
m_isInitilized = true;
}
public bool MoveNext()
{
if (!m_isInitilized)
{
Initialize();
}
return CurrentMover();
}
public void Reset()
{
m_isInitilized = false;
m_secondReloaded = false;
CurrentMover = null;
m_enumeratorT = null;
m_enumeratorS = null;
DoDispose();
}
}
I'd like to click +1 but when I see the variable names it says -1 so at the and it's a 0 ;-)
– t3chb0t
11 hours ago
I think the word unconventional describes them pretty good... althoughts
andss
were below that level ;-]
– t3chb0t
11 hours ago
I can explain that ;-P It's local and in a very small scope, this is allowed in my world,tt
andss
on the other hand werepublic
.
– t3chb0t
11 hours ago
add a comment |
up vote
2
down vote
up vote
2
down vote
If you should respect that not all Enumerators
implement Reset()
then it is not possible to use using
statements for the two IEnumerators
. But you could introduce an IEnumerator<TResult>
for the zipped result and use it like this:
public static IEnumerable<TResult> ZipNew<TFirst, TSecond, TResult>(
this IEnumerable<TFirst> first,
IEnumerable<TSecond> second,
Func<TFirst, TSecond, TResult> resultSelector)
{
using (ZipEnumerator<TFirst, TSecond, TResult> zipEnumerator = new ZipEnumerator<TFirst, TSecond, TResult>(first, second, resultSelector))
{
while (zipEnumerator.MoveNext())
{
yield return zipEnumerator.Current;
}
}
}
In this way you're back on the using track.
The ZipEnumerator
it self could be something like:
public class ZipEnumerator<T, S, TResult> : IEnumerator<TResult>
{
IEnumerable<T> m_dataT;
IEnumerable<S> m_dataS;
IEnumerator<T> m_enumerT;
IEnumerator<S> m_enumerS;
List<IDisposable> m_disposables = new List<IDisposable>();
Func<T, S, TResult> m_selector;
bool m_secondReloaded = false;
bool m_first = true;
public ZipEnumerator(IEnumerable<T> dataT, IEnumerable<S> dataS, Func<T, S, TResult> selector)
{
m_dataT = dataT ?? throw new ArgumentNullException(nameof(dataT));
m_dataS = dataS ?? throw new ArgumentNullException(nameof(dataS));
m_selector = selector ?? throw new ArgumentNullException(nameof(selector));
}
public TResult Current => m_selector(m_enumerT.Current, m_enumerS.Current);
object IEnumerator.Current => Current;
public void Dispose()
{
foreach (IDisposable disposable in m_disposables)
{
disposable.Dispose();
}
m_disposables.Clear();
}
private IEnumerator<T> GetTEnumerator()
{
var enumerator = m_dataT.GetEnumerator();
m_disposables.Add(enumerator);
return enumerator;
}
private IEnumerator<S> GetSEnumerator()
{
var enumerator = m_dataS.GetEnumerator();
m_disposables.Add(enumerator);
return enumerator;
}
public bool MoveNext()
{
m_enumerT = m_enumerT ?? GetTEnumerator();
m_enumerS = m_enumerS ?? GetSEnumerator();
if (m_first)
{
if (m_enumerT.MoveNext())
{
if (!m_enumerS.MoveNext())
{
m_enumerS = GetSEnumerator();
m_secondReloaded = true;
if (!m_enumerS.MoveNext())
return false;
}
return true;
}
else
{
m_first = false;
}
}
if (!m_first && !m_secondReloaded)
{
if (m_enumerS.MoveNext())
{
if (!m_enumerT.MoveNext())
{
m_enumerT = GetTEnumerator();
if (!m_enumerT.MoveNext())
return false;
}
return true;
}
}
return false;
}
public void Reset()
{
m_secondReloaded = false;
m_first = true;
m_enumerT = null;
m_enumerS = null;
Dispose();
}
}
It's a little more code than other suggestions, but it encapsulates the problems with the disposal of intermediate enumerators without the necessity of a try-catch-statement. You could discuss if the disposal should be immediately when the enumerator is done or as I do collect them for disposal when the ZipEnumerator
itself is disposed off?
The MoveNext()
method went a little more complicated than I like, so feel free to edit or suggest improvements.
Edit
A refactored version of ZipEnumerator
:
public class ZipEnumerator<T, S, TResult> : IEnumerator<TResult>
{
IEnumerable<T> m_dataT;
IEnumerable<S> m_dataS;
IEnumerator<T> m_enumeratorT;
IEnumerator<S> m_enumeratorS;
List<IDisposable> m_disposables = new List<IDisposable>();
Func<T, S, TResult> m_selector;
bool m_secondReloaded = false;
bool m_isInitilized = false;
public ZipEnumerator(IEnumerable<T> dataT, IEnumerable<S> dataS, Func<T, S, TResult> selector)
{
m_dataT = dataT ?? throw new ArgumentNullException(nameof(dataT));
m_dataS = dataS ?? throw new ArgumentNullException(nameof(dataS));
m_selector = selector ?? throw new ArgumentNullException(nameof(selector));
}
public TResult Current => m_selector(m_enumeratorT.Current, m_enumeratorS.Current);
object IEnumerator.Current => Current;
public void Dispose()
{
DoDispose();
}
private void RegisterDisposable(IDisposable disposable)
{
m_disposables.Add(disposable);
if (m_disposables.Count > 10)
{
DoDispose();
}
}
private void DoDispose()
{
foreach (IDisposable disposable in m_disposables)
{
disposable.Dispose();
}
m_disposables.Clear();
}
private IEnumerator<T> GetTEnumerator()
{
var enumerator = m_dataT.GetEnumerator();
RegisterDisposable(enumerator);
return enumerator;
}
private IEnumerator<S> GetSEnumerator()
{
var enumerator = m_dataS.GetEnumerator();
RegisterDisposable(enumerator);
return enumerator;
}
private Func<bool> CurrentMover = null;
private bool FirstMover()
{
if (m_enumeratorT.MoveNext())
{
if (!m_enumeratorS.MoveNext())
{
m_enumeratorS = GetSEnumerator();
m_secondReloaded = true;
if (!m_enumeratorS.MoveNext())
return false;
}
return true;
}
else if (!m_secondReloaded)
{
CurrentMover = SecondMover;
return CurrentMover();
}
return false;
}
private bool SecondMover()
{
if (m_enumeratorS.MoveNext())
{
if (!m_enumeratorT.MoveNext())
{
m_enumeratorT = GetTEnumerator();
if (!m_enumeratorT.MoveNext())
return false;
}
return true;
}
return false;
}
private void Initialize()
{
m_enumeratorT = GetTEnumerator();
m_enumeratorS = GetSEnumerator();
CurrentMover = FirstMover;
m_isInitilized = true;
}
public bool MoveNext()
{
if (!m_isInitilized)
{
Initialize();
}
return CurrentMover();
}
public void Reset()
{
m_isInitilized = false;
m_secondReloaded = false;
CurrentMover = null;
m_enumeratorT = null;
m_enumeratorS = null;
DoDispose();
}
}
If you should respect that not all Enumerators
implement Reset()
then it is not possible to use using
statements for the two IEnumerators
. But you could introduce an IEnumerator<TResult>
for the zipped result and use it like this:
public static IEnumerable<TResult> ZipNew<TFirst, TSecond, TResult>(
this IEnumerable<TFirst> first,
IEnumerable<TSecond> second,
Func<TFirst, TSecond, TResult> resultSelector)
{
using (ZipEnumerator<TFirst, TSecond, TResult> zipEnumerator = new ZipEnumerator<TFirst, TSecond, TResult>(first, second, resultSelector))
{
while (zipEnumerator.MoveNext())
{
yield return zipEnumerator.Current;
}
}
}
In this way you're back on the using track.
The ZipEnumerator
it self could be something like:
public class ZipEnumerator<T, S, TResult> : IEnumerator<TResult>
{
IEnumerable<T> m_dataT;
IEnumerable<S> m_dataS;
IEnumerator<T> m_enumerT;
IEnumerator<S> m_enumerS;
List<IDisposable> m_disposables = new List<IDisposable>();
Func<T, S, TResult> m_selector;
bool m_secondReloaded = false;
bool m_first = true;
public ZipEnumerator(IEnumerable<T> dataT, IEnumerable<S> dataS, Func<T, S, TResult> selector)
{
m_dataT = dataT ?? throw new ArgumentNullException(nameof(dataT));
m_dataS = dataS ?? throw new ArgumentNullException(nameof(dataS));
m_selector = selector ?? throw new ArgumentNullException(nameof(selector));
}
public TResult Current => m_selector(m_enumerT.Current, m_enumerS.Current);
object IEnumerator.Current => Current;
public void Dispose()
{
foreach (IDisposable disposable in m_disposables)
{
disposable.Dispose();
}
m_disposables.Clear();
}
private IEnumerator<T> GetTEnumerator()
{
var enumerator = m_dataT.GetEnumerator();
m_disposables.Add(enumerator);
return enumerator;
}
private IEnumerator<S> GetSEnumerator()
{
var enumerator = m_dataS.GetEnumerator();
m_disposables.Add(enumerator);
return enumerator;
}
public bool MoveNext()
{
m_enumerT = m_enumerT ?? GetTEnumerator();
m_enumerS = m_enumerS ?? GetSEnumerator();
if (m_first)
{
if (m_enumerT.MoveNext())
{
if (!m_enumerS.MoveNext())
{
m_enumerS = GetSEnumerator();
m_secondReloaded = true;
if (!m_enumerS.MoveNext())
return false;
}
return true;
}
else
{
m_first = false;
}
}
if (!m_first && !m_secondReloaded)
{
if (m_enumerS.MoveNext())
{
if (!m_enumerT.MoveNext())
{
m_enumerT = GetTEnumerator();
if (!m_enumerT.MoveNext())
return false;
}
return true;
}
}
return false;
}
public void Reset()
{
m_secondReloaded = false;
m_first = true;
m_enumerT = null;
m_enumerS = null;
Dispose();
}
}
It's a little more code than other suggestions, but it encapsulates the problems with the disposal of intermediate enumerators without the necessity of a try-catch-statement. You could discuss if the disposal should be immediately when the enumerator is done or as I do collect them for disposal when the ZipEnumerator
itself is disposed off?
The MoveNext()
method went a little more complicated than I like, so feel free to edit or suggest improvements.
Edit
A refactored version of ZipEnumerator
:
public class ZipEnumerator<T, S, TResult> : IEnumerator<TResult>
{
IEnumerable<T> m_dataT;
IEnumerable<S> m_dataS;
IEnumerator<T> m_enumeratorT;
IEnumerator<S> m_enumeratorS;
List<IDisposable> m_disposables = new List<IDisposable>();
Func<T, S, TResult> m_selector;
bool m_secondReloaded = false;
bool m_isInitilized = false;
public ZipEnumerator(IEnumerable<T> dataT, IEnumerable<S> dataS, Func<T, S, TResult> selector)
{
m_dataT = dataT ?? throw new ArgumentNullException(nameof(dataT));
m_dataS = dataS ?? throw new ArgumentNullException(nameof(dataS));
m_selector = selector ?? throw new ArgumentNullException(nameof(selector));
}
public TResult Current => m_selector(m_enumeratorT.Current, m_enumeratorS.Current);
object IEnumerator.Current => Current;
public void Dispose()
{
DoDispose();
}
private void RegisterDisposable(IDisposable disposable)
{
m_disposables.Add(disposable);
if (m_disposables.Count > 10)
{
DoDispose();
}
}
private void DoDispose()
{
foreach (IDisposable disposable in m_disposables)
{
disposable.Dispose();
}
m_disposables.Clear();
}
private IEnumerator<T> GetTEnumerator()
{
var enumerator = m_dataT.GetEnumerator();
RegisterDisposable(enumerator);
return enumerator;
}
private IEnumerator<S> GetSEnumerator()
{
var enumerator = m_dataS.GetEnumerator();
RegisterDisposable(enumerator);
return enumerator;
}
private Func<bool> CurrentMover = null;
private bool FirstMover()
{
if (m_enumeratorT.MoveNext())
{
if (!m_enumeratorS.MoveNext())
{
m_enumeratorS = GetSEnumerator();
m_secondReloaded = true;
if (!m_enumeratorS.MoveNext())
return false;
}
return true;
}
else if (!m_secondReloaded)
{
CurrentMover = SecondMover;
return CurrentMover();
}
return false;
}
private bool SecondMover()
{
if (m_enumeratorS.MoveNext())
{
if (!m_enumeratorT.MoveNext())
{
m_enumeratorT = GetTEnumerator();
if (!m_enumeratorT.MoveNext())
return false;
}
return true;
}
return false;
}
private void Initialize()
{
m_enumeratorT = GetTEnumerator();
m_enumeratorS = GetSEnumerator();
CurrentMover = FirstMover;
m_isInitilized = true;
}
public bool MoveNext()
{
if (!m_isInitilized)
{
Initialize();
}
return CurrentMover();
}
public void Reset()
{
m_isInitilized = false;
m_secondReloaded = false;
CurrentMover = null;
m_enumeratorT = null;
m_enumeratorS = null;
DoDispose();
}
}
edited 1 hour ago
answered 11 hours ago
Henrik Hansen
6,1531722
6,1531722
I'd like to click +1 but when I see the variable names it says -1 so at the and it's a 0 ;-)
– t3chb0t
11 hours ago
I think the word unconventional describes them pretty good... althoughts
andss
were below that level ;-]
– t3chb0t
11 hours ago
I can explain that ;-P It's local and in a very small scope, this is allowed in my world,tt
andss
on the other hand werepublic
.
– t3chb0t
11 hours ago
add a comment |
I'd like to click +1 but when I see the variable names it says -1 so at the and it's a 0 ;-)
– t3chb0t
11 hours ago
I think the word unconventional describes them pretty good... althoughts
andss
were below that level ;-]
– t3chb0t
11 hours ago
I can explain that ;-P It's local and in a very small scope, this is allowed in my world,tt
andss
on the other hand werepublic
.
– t3chb0t
11 hours ago
I'd like to click +1 but when I see the variable names it says -1 so at the and it's a 0 ;-)
– t3chb0t
11 hours ago
I'd like to click +1 but when I see the variable names it says -1 so at the and it's a 0 ;-)
– t3chb0t
11 hours ago
I think the word unconventional describes them pretty good... although
ts
and ss
were below that level ;-]– t3chb0t
11 hours ago
I think the word unconventional describes them pretty good... although
ts
and ss
were below that level ;-]– t3chb0t
11 hours ago
I can explain that ;-P It's local and in a very small scope, this is allowed in my world,
tt
and ss
on the other hand were public
.– t3chb0t
11 hours ago
I can explain that ;-P It's local and in a very small scope, this is allowed in my world,
tt
and ss
on the other hand were public
.– t3chb0t
11 hours ago
add a comment |
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f208152%2fcustom-implementation-of-the-linq-zip-operator-for-different-length-lists%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
2
Your question would benefit from some example usage within the question itself, so that we can see how you intend the method to be consumed, and quickly verify that it is working as intended.
– VisualMelon
16 hours ago
@VisualMelon improved, full working code
– BWA
16 hours ago
1
Never
Reset
an enumerator. Just get a new enumerator.Reset
was a misfeature intended for COM interop scenarios.– Eric Lippert
11 hours ago