Python: How To Use A Generator To Avoid Sql Memory Issue
I have following method that access mysql database and the query get executed in a server that I don't have any access to change anything on regarding increasing memory. I am new t
Solution 1:
With MySQLdb, the default cursor loads the entire result set into a Python list when the call to cursor.execute(..)
is made. For a large query that may cause a MemoryError whether or not you use a generator.
Instead, use an SSCursor or SSDictCursor. These will keep the result set on the server side, and allow you to interate through the items in the result set on the client side:
import MySQLdb
import MySQLdb.cursors as cursors
import traceback
defgetUNames(self):
# You may of course want to define `self.gdbCursor` somewhere else...
conn = MySQLdb.connect(..., cursorclass=cursors.SSCursor)
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^# Set the cursor class to SSCursor here
self.gdbCursor = conn.cursor()
globalUserQuery = ur'''SELECT gu_name FROM globaluser WHERE gu_locked = 0'''try:
self.gdbCursor.execute(globalUserQuery)
for row in self.gdbCursor:
uName = unicode(row['gu_name'], 'utf-8')
yield uName
except Exception as e:
traceback.print_exc()
There isn't much documentation on the difference between the default Cursor
and the SSCursor
. The best source I know is the docstrings of the Cursor Mixin classes themselves:
The default cursor uses a CursorStoreResultMixIn
:
In [2]: import MySQLdb.cursors as cursors
In [8]: print(cursors.CursorStoreResultMixIn.__doc__)
This is a MixIn class which causes the entire result setto be
stored on the client side, i.e. it uses mysql_store_result(). If the
result set can be very large, consider adding a LIMIT clause to your
query, orusing CursorUseResultMixIn instead.
and the SSCursor uses a CursorUseResultMixIn
:
In [9]: print(cursors.CursorUseResultMixIn.__doc__)
This is a MixIn class which causes the resultsetto be stored
in the server and sent row-by-rowto client side, i.e. it uses
mysql_use_result(). You MUST retrieve the entire resultsetandclose() the cursor before additional queries can be peformed on
the connection.
Since I changed getUNames
into a generator, it would be used like this:
forrowinself.getUnames():
...
Post a Comment for "Python: How To Use A Generator To Avoid Sql Memory Issue"