
I needed to process a config file just now. Because of the way it was generated, it contains lines like this:

---(more 15%)---

The first step is to strip these unwanted lines out. As a slight twist, each of these lines is followed by a blank line, which I also want to strip. I created a quick Python script to do this:

import sys

skip_next = False
for line in sys.stdin:
    if skip_next:
        skip_next = False
        continue
    if line.startswith('---(more'):
        skip_next = True
        continue
    print line,

Now, this works, but it's more hacky than I'd hoped. The difficulty is that when looping through the lines, we want the content of one line to affect the subsequent line. Hence my question: What's an elegant way for one loop iteration to affect another?

Answer 1


The reason this feels awkward is that you're fundamentally Doing It Wrong. A for loop is supposed to be a sequential iteration over each element of a series. If you're doing something that's calling continue without even looking at the current element, based on something that happened in a previous element of a series, you're breaking that basic abstraction. You're then introducing awkwardness with the extra moving parts required to take care of the square-peg-in-round-hole solution you're setting up.

Instead, try keeping the action close to the condition that causes it. We know that a for loop is just syntactic sugar for a special case of a while loop, so let's use that. Pseudocode, since I'm not familiar with Python's I/O subsystem:

while not sys.stdin.eof: //or whatever
    line = sys.stdin.ReadLine()
    if line.startswith('---(more'):
        sys.stdin.ReadLine() //read the next line and ignore it
        continue    
    print line
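For reference, here is roughly how the pseudocode above could look as real Python. This is a sketch in Python 3, not the answerer's code: the function name strip_pager_lines and the choice to collect the kept lines in a list are assumptions made so the example is self-contained, and an io.StringIO object stands in for sys.stdin.

```python
import io

def strip_pager_lines(stream):
    """Return the lines of `stream`, minus each marker line and the line after it."""
    kept = []
    while True:
        line = stream.readline()
        if not line:  # readline() returns '' only at end of file
            break
        if line.startswith('---(more'):
            stream.readline()  # read the next line and ignore it
            continue
        kept.append(line)
    return kept

# Stand-in for sys.stdin so the sketch runs without piped input.
sample = io.StringIO("a\n---(more 15%)---\n\nb\n")
kept_lines = strip_pager_lines(sample)
```

Python file objects have no eof attribute, so the loop relies on readline() returning an empty string at end of input. A blank line in the file still comes back as '\n', so it does not end the loop early.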

Answer 2


Another way to do this is with itertools.tee, which splits one iterator into two. If you pad one of them with an empty string at the front, it runs one line behind the other. You can then zip the two iterators together and look at both the current line and the previous line at each step of the for loop (the padding gives the first line something to be paired with, so it isn't dropped):

import sys
from itertools import chain, izip, tee
in1, in2 = tee(sys.stdin, 2)
in2 = chain([''], in2)  # in2 now lags one line behind in1
for line, prevline in izip(in1, in2):
    if line.startswith('---(more') or prevline.startswith('---(more'):
        continue
    print line,

This could also be done as an equivalent generator expression:

import sys
from itertools import chain, izip, tee
in1, in2 = tee(sys.stdin, 2)
pairs = izip(in1, chain([''], in2))  # (line, previous line); '' before the first line
res = (line for line, prevline in pairs
       if not line.startswith('---(more') and not prevline.startswith('---(more'))
for line in res:
    print line,

Or you could use filter, which drops the items for which a condition is not true. Note that it yields the surviving pairs, so the loop has to unpack them:

import sys
from itertools import chain, izip, tee
in1, in2 = tee(sys.stdin, 2)
pairs = izip(in1, chain([''], in2))  # (line, previous line)
cond = lambda pair: not pair[0].startswith('---(more') and not pair[1].startswith('---(more')
res = filter(cond, pairs)
for line, prevline in res:
    print line,
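For anyone on Python 3, where izip and izip_longest no longer exist, the same pairing idea might look like the sketch below. The function name drop_marker_and_next and the marker parameter are made up for illustration, and the lagging iterator is padded with an empty string so that the first line is paired with "nothing" rather than skipped.

```python
from itertools import chain, tee

def drop_marker_and_next(lines, marker='---(more'):
    """Yield lines, skipping marker lines and the line right after each one."""
    current, previous = tee(lines, 2)
    # Pad the second iterator so it lags one line behind the first.
    previous = chain([''], previous)
    for line, prevline in zip(current, previous):
        if line.startswith(marker) or prevline.startswith(marker):
            continue
        yield line

sample = ['a\n', '---(more 15%)---\n', '\n', 'b\n']
result = list(drop_marker_and_next(sample))
```

Because it is a generator over any iterable of lines, it can be fed sys.stdin directly, for example sys.stdout.writelines(drop_marker_and_next(sys.stdin)).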

If you are willing to go outside the Python standard library, the toolz package makes this even easier. It provides a sliding_window function, which splits an iterator such as a b c d e f into something like (a,b), (b,c), (c,d), (d,e), (e,f). This does basically the same thing as the tee approach above; it just combines three lines into one. Each pair is (previous, current), and the input is padded with an empty string so the first line also shows up as the current element of a pair:

import sys
from itertools import chain
from toolz.itertoolz import sliding_window
for prevline, line in sliding_window(2, chain([''], sys.stdin)):
    if line.startswith('---(more') or prevline.startswith('---(more'):
        continue
    print line,

You could additionally use remove, which is basically the opposite of filter, to drop the items without needing an if/continue inside the loop:

import sys
from itertools import chain
from toolz.itertoolz import sliding_window, remove
pairs = sliding_window(2, chain([''], sys.stdin))  # (previous line, line)
cond = lambda x: x[0].startswith('---(more') or x[1].startswith('---(more')
res = remove(cond, pairs)
for prevline, line in res:
    print line,
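If installing toolz is not an option, a sliding window is short enough to write by hand. The sketch below is Python 3 with the standard library only. The helper name windows is hypothetical and this is not how toolz implements sliding_window; it only aims to produce the same (a,b), (b,c), ... tuples described above.

```python
from collections import deque

def windows(n, iterable):
    """Yield overlapping tuples of n consecutive items from iterable."""
    window = deque(maxlen=n)  # the oldest item falls off automatically
    for item in iterable:
        window.append(item)
        if len(window) == n:
            yield tuple(window)

pairs = list(windows(2, 'abcdef'))
```

With n=2, each tuple is (previous, current), so the current line is the second element of the pair.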

Answer 3


In this case, we can skip a line by manually advancing the iterator (there is a related Stack Overflow question about this). This results in code that is somewhat similar to Mason Wheeler's solution, but still uses the iteration syntax:

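import sys

for line in sys.stdin:
    if line.startswith('---(more'):
        # Consume the following line; the default avoids a StopIteration
        # error if the marker happens to be the last line of input.
        next(sys.stdin, None)
        continue
    print line,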
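The same trick carries over to Python 3 and to any iterable, not just sys.stdin, as long as the loop and the manual advance share one iterator object. A minimal sketch, with the function name skip_marker_blocks and the marker parameter invented for the example:

```python
def skip_marker_blocks(lines, marker='---(more'):
    """Yield lines, dropping each marker line and the line that follows it."""
    it = iter(lines)  # one shared iterator for the loop and for next()
    for line in it:
        if line.startswith(marker):
            # Advance past the following line. The default matters here:
            # a bare next(it) at end of input would raise StopIteration,
            # which Python 3.7+ turns into RuntimeError inside a generator.
            next(it, None)
            continue
        yield line

sample = ['a\n', '---(more 15%)---\n', '\n', 'b\n', '---(more 99%)---\n']
result = list(skip_marker_blocks(sample))
```

Calling iter() explicitly is what makes this work on a list: looping over the list itself and calling next() on a separate iterator would not skip anything.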