python - Do all iterators cache? How about csv.reader?
We know that the following code loads the data line by line rather than loading it all into memory, i.e. a line that has already been read is somehow marked 'deletable' for the OS:
    def filegen(file):
        for line in file:
            yield line

    with open("somefile") as file:
        for line in filegen(file):
            print line
But is there any way to verify that this is still true if I modify the definition of filegen as follows?
    import csv

    def filegen(file):
        for line in csv.reader(file):
            yield line
How can I know whether csv.reader caches the data it has loaded?
Regards, John
The most reliable way to find out what csv.reader is doing is to read the source. See _csv.c, lines 773 onwards. You'll see that the reader object has a pointer to the underlying iterator (typically a file iterator), and calls PyIter_Next each time it needs another line. It does not read ahead or otherwise cache the data that it loads.
Another way to find out what csv.reader is doing is to make a mock file object that can report when it is being queried. For example:
    import csv

    class MockFile:
        def __init__(self):
            self.line = 0
        def __iter__(self):
            return self
        def next(self):  # Python 2 iterator protocol
            self.line += 1
            print "MockFile line", self.line  # report each request
            return "line,{0}".format(self.line)

    >>> r = csv.reader(MockFile())
    >>> next(r)
    MockFile line 1
    ['line', '1']
    >>> next(r)
    MockFile line 2
    ['line', '2']
This confirms what we learned by reading the csv source code: it only requests the next line from the underlying iterator when its own next method is called.
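One refinement is easy to check with the same idea. The sketch below is written for Python 3 (where the iterator method is `__next__`) and uses a hypothetical `CountingLines` wrapper, which is not part of the csv module, to count how many lines the reader has pulled. It suggests that a row normally costs one line, and that a quoted field containing a newline makes the reader pull further lines, but only as many as it needs to finish that row.

```python
import csv

class CountingLines:
    """Hypothetical helper: yields lines and counts how many were requested."""
    def __init__(self, lines):
        self._lines = iter(lines)
        self.pulled = 0

    def __iter__(self):
        return self

    def __next__(self):
        line = next(self._lines)  # StopIteration propagates at end of input
        self.pulled += 1
        return line

# The second record has a quoted field that spans two input lines.
source = CountingLines(['a,b\n', 'c,"d\n', 'e",f\n', 'g,h\n'])
reader = csv.reader(source)

row1 = next(reader)
pulled_after_row1 = source.pulled  # 1: one line was enough for this row

row2 = next(reader)
pulled_after_row2 = source.pulled  # 3: two more lines were needed, the last is untouched

print(row1, pulled_after_row1)
print(row2, pulled_after_row2)
```

So "one line per row" is the common case rather than a guarantee; what the reader does not do is fetch lines beyond the row it is currently building.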
John has made it clear (see comments) that his concern is whether csv.reader keeps the lines alive, preventing them from being collected by Python's memory manager.
Again, you can either read the code (most reliable) or try an experiment. If you look at the implementation of Reader_iternext in _csv.c, you'll see that lineobj is the name given to the object returned by the underlying iterator, and there's a call to Py_DECREF(lineobj) on every path through the code. So csv.reader does not keep lineobj alive.
Here's an experiment to confirm that.
    class FinalizableString(str):  # subclass str so csv.reader accepts it
        """A string that reports its deletion."""
        def __init__(self, s):
            self.s = s
        def __str__(self):
            return self.s
        def __del__(self):
            print "*** deleting", self.s

    class MockFile:
        def __init__(self):
            self.line = 0
        def __iter__(self):
            return self
        def next(self):  # Python 2 iterator protocol
            self.line += 1
            return FinalizableString("line,{0}".format(self.line))

    >>> r = csv.reader(MockFile())
    >>> next(r)
    *** deleting line,1
    ['line', '1']
    >>> next(r)
    *** deleting line,2
    ['line', '2']
So you can see that csv.reader does not hang on to the objects it gets from its iterator, and if nothing else is keeping them alive, they will be garbage-collected in a timely fashion.
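The same check can be run against the filegen wrapper from the question. This is only a sketch under a few assumptions: it uses Python 3 syntax (`__next__`), it relies on CPython's reference counting to finalize objects immediately, and the `deleted` list and the `count` argument to `MockFile` are illustrative additions rather than anything from the csv module. Instead of printing, `__del__` records each finalized line so the result can be inspected afterwards.

```python
import csv

deleted = []  # text of every line object finalized so far (illustrative)

class FinalizableString(str):
    """A str subclass that records its own deletion."""
    def __del__(self):
        deleted.append(str(self))

class MockFile:
    """Mock file yielding `count` lines of the form 'line,N'."""
    def __init__(self, count):
        self.count = count
        self.line = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.line >= self.count:
            raise StopIteration
        self.line += 1
        return FinalizableString("line,{0}".format(self.line))

def filegen(file):
    # The generator from the question, wrapping csv.reader.
    for row in csv.reader(file):
        yield row

rows = filegen(MockFile(3))

first = next(rows)
deleted_after_first = list(deleted)   # snapshot after the first row

second = next(rows)
deleted_after_second = list(deleted)  # snapshot after the second row

print(first, deleted_after_first)
print(second, deleted_after_second)
```

On CPython each line object should already have been finalized by the time its row comes out of the generator, so the generator wrapper does not appear to change the picture. On an implementation without reference counting the finalizers may run later, which is why the timing here is an assumption and not a guarantee.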
I have a feeling that there's something more to this question that you're not telling us. Can you explain why you are worried about this?