How do you close connections and reuse connections with urllib2?
-
This is a two-part question. The first part: how do you properly close a connection with urllib2? I have seen a number of examples and adopted the best solution I could find, but the files still do not seem to get closed. Currently I use contextlib's closing() as follows:

    try:
        with closing(self.opener.open(self.address, None, self.timeout)) as page:
            self.data = page.read()
    except:
        # bail out..

However, I still get a "too many open files" error after a long time on OSX. I used ulimit to increase the file limit to 2000 and above, and I also set the kernel's max files to >40,000.

I should note that the object this method belongs to is not disposed of; it stays around for the life of the program. However, I only keep the "data" stored in the object, along with the address and timeout. I don't keep the file-like object stored. I thought the problem might be lingering references, but I don't believe so, because I never store a reference to the file-like object directly, only the data from read(). These objects are reused and reloaded with new data each time a thread pulls a URL out of the Queue.

I only open roughly 50 connections at a time, so I don't quite understand how I could run out of files. Also, when I run out of files, netstat starts failing with malloc errors:

    netstat(439) malloc: *** mmap(size=18446744073708605440) failed (error code=12)
    *** error: can't allocate region
    *** set a breakpoint in malloc_error_break to debug
    netstat: malloc 18446744073708605396 bytes: Cannot allocate memory

I also can't find a way to reset the connections and get netstat back to normal without shutting down.
Output of netstat -m:

    $ netstat -m
    475/3803 mbufs in use:
        475 mbufs allocated to data
        3328 mbufs allocated to caches
    407/3814 mbuf 2KB clusters in use
    0/577 mbuf 4KB clusters in use
    0/12 mbuf 16KB clusters in use
    11242 KB allocated to network (8.3% in use)
    0 requests for memory denied
    0 requests for memory delayed
    0 calls to drain routines

I am having trouble locating the error, but I believe the connections are not being closed in a timely manner. I am also well aware that the connections are not being reused, even when connecting to a single domain (which I would like). That is the second part of the question: how can someone reuse a connection with urllib2?

I have multiple threads getting URLs from a Queue, and each retrieves its data via this kind of routine. If possible, I would like to reuse a connection even if it was opened by another thread. The only data shared between threads is the URL queue. I have looked at other modules, but they appear to need more sharing of data than just a URL.
-
Answer:
I would recommend dropping urllib2 and trying out the fantastic Requests library (http://docs.python-requests.org). It takes care of reusing and closing connections for you; keep-alive is automatic within a session (http://docs.python-requests.org/en/latest/user/advanced/#keep-alive). You might also be interested in its support for making asynchronous requests (http://docs.python-requests.org/en/latest/user/advanced/#asynchronous-requests).
Corey at Stack Overflow