python - Do all iterators cache? How about csv.Reader?


We know that the following code loads the data line by line rather than loading it all into memory, i.e. a line that has already been read is somehow marked 'deletable' for the OS.

    def filegen(file):
        for line in file:
            yield line

    with open("somefile") as file:
        for line in filegen(file):
            print(line)

But is there any way to verify whether this is still true if I modify the definition of filegen as follows?

    import csv

    def filegen(file):
        for line in csv.reader(file):
            yield line

How can I know whether csv.reader caches the data it has loaded?

Regards, John

The most reliable way to find out what csv.reader is doing is to read the source. See _csv.c, lines 773 onwards. You'll see that the reader object has a pointer to the underlying iterator (typically a file iterator), and it calls PyIter_Next each time it needs another line. It does not read ahead or otherwise cache the data it loads.

Another way to find out what csv.reader is doing is to make a mock file object that can report when it is being queried. For example:

    import csv

    class MockFile:
        """A mock file that reports each time it is asked for a line."""
        def __init__(self): self.line = 0
        def __iter__(self): return self
        def __next__(self):
            self.line += 1
            print("mockfile line {0}".format(self.line))
            return "line,{0}".format(self.line)
        next = __next__  # Python 2 spells this method "next"

    >>> r = csv.reader(MockFile())
    >>> next(r)
    mockfile line 1
    ['line', '1']
    >>> next(r)
    mockfile line 2
    ['line', '2']

This confirms what we learned by reading the csv source code: it only requests the next line from the underlying iterator when its own next method is called.
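One nuance is worth checking with the same mock-file idea: a record whose quoted field contains a newline spans more than one line, so the reader has to pull more than one line to finish that row. It still pulls only as much as the current row needs. The following is a minimal sketch, assuming Python 3; CountingLines and pulls_per_row are hypothetical names made up for this illustration.

```python
import csv


class CountingLines:
    """Hypothetical wrapper that counts how many lines have been pulled from it."""

    def __init__(self, lines):
        self._lines = iter(lines)
        self.pulled = 0

    def __iter__(self):
        return self

    def __next__(self):
        line = next(self._lines)  # StopIteration propagates when the input ends
        self.pulled += 1
        return line


def pulls_per_row(lines):
    """Return (row, cumulative number of lines pulled) for each row produced."""
    source = CountingLines(lines)
    result = []
    for row in csv.reader(source):
        result.append((row, source.pulled))
    return result


if __name__ == "__main__":
    # The second record has a quoted field that continues on the next line.
    sample = ['a,b\n', 'c,"d\n', 'e",f\n', 'g,h\n']
    for row, pulled in pulls_per_row(sample):
        print(pulled, row)
```

If the count ever ran ahead of what the rows produced so far require, that would be evidence of read-ahead; with the sample above the count only jumps by two on the record that really does span two lines.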


John has made it clear (see comments) that his concern is whether csv.reader keeps the lines alive, preventing them from being collected by Python's memory manager.

Again, you can either read the code (most reliable) or try an experiment. If you look at the implementation of Reader_iternext in _csv.c, you'll see that lineobj is the name given to the object returned by the underlying iterator, and there's a call to Py_DECREF(lineobj) on every path through the code. So csv.reader does not keep lineobj alive.

Here's an experiment to confirm that.

    import csv

    class FinalizableString(str):
        """A string that reports its deletion."""
        def __init__(self, s): self.s = s
        def __str__(self): return self.s
        def __del__(self): print("*** deleting {0}".format(self.s))

    class MockFile:
        def __init__(self): self.line = 0
        def __iter__(self): return self
        def __next__(self):
            self.line += 1
            return FinalizableString("line,{0}".format(self.line))
        next = __next__  # Python 2 spells this method "next"

    >>> r = csv.reader(MockFile())
    >>> next(r)
    *** deleting line,1
    ['line', '1']
    >>> next(r)
    *** deleting line,2
    ['line', '2']

So you can see that csv.reader does not hang on to the objects it gets from its iterator, and if nothing else is keeping them alive, they will be garbage-collected in a timely fashion.
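If you would rather have a check that returns numbers instead of printing from a finalizer, weak references can do the same job. This is only a sketch, assuming Python 3 on CPython (where reference counting frees objects promptly; gc.collect() is called as a precaution for other implementations). TrackedLine, TrackingFile and live_lines_after_each_row are hypothetical names invented here, and the keep flag exists only as a control to show that the check does notice retained lines.

```python
import csv
import gc
import weakref


class TrackedLine(str):
    """A str subclass; plain str objects cannot be weakly referenced."""


class TrackingFile:
    """Hypothetical mock file that keeps a weak reference to every line it returns."""

    def __init__(self, count, keep=False):
        self.count = count
        self.line = 0
        self.refs = []                       # weak references only
        self.kept = [] if keep else None     # strong references (control case)

    def __iter__(self):
        return self

    def __next__(self):
        if self.line >= self.count:
            raise StopIteration
        self.line += 1
        line = TrackedLine("line,{0}".format(self.line))
        self.refs.append(weakref.ref(line))
        if self.kept is not None:
            self.kept.append(line)
        return line

    def alive(self):
        """Number of handed-out line objects that still exist."""
        gc.collect()
        return sum(1 for ref in self.refs if ref() is not None)


def live_lines_after_each_row(count, keep=False):
    """Return how many line objects are still alive after each row is produced."""
    source = TrackingFile(count, keep=keep)
    counts = []
    for row in csv.reader(source):
        counts.append(source.alive())
    return counts
```

With keep left at False the count should stay at zero after every row, because the reader was the only holder of each line; with keep=True the count grows by one per row, which shows the measurement is sensitive to something holding on to the lines.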


I have a feeling that there's more to this question than you're telling us. Can you explain why you are worried about this?

