Python code question
-
Can someone tell me why this bit of Python code is "thread-unsafe"? I'm teaching myself Python. I stumbled across this pseudo-random number generator: http://www.koders.com/python/fid9EC20FF460960EAD9AA088D695A3D3AA7C7CC710.aspx#L95. Lines 69:75 are commented as being "thread-unsafe". Can someone explain to me why this is considered to be "thread-unsafe"? I've read Wikipedia's entry on thread safety (http://en.wikipedia.org/wiki/Thread_safety), but I still don't understand why the lines I mention are considered to be thread-unsafe. My assumption here is that the comment is correct, given who the person identified as being the programmer is... Thanks for any insight.
-
Answer:
sbutler is correct. To make it thread-safe, you'd have to acquire a lock before reading the self._seed variable and release it after updating it. Alternatively, calculate the new seed and use an atomic swap operation to update it, making sure that no one else has already done so; that would be a lot better if you were using a language that actually supported hardware threads... It's also worth noting that the code would be fine if you were using multiple processes, assuming they initialize the seed independently.
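For what it's worth, here is a rough sketch of the locking approach in Python 3. The class name LockedGenerator and the helper draw_from_threads are made up for illustration, and the arithmetic is a Wichmann-Hill style update that I'm assuming resembles the linked code; it is not copied from that file. The atomic-swap variant isn't shown.

```python
import threading


class LockedGenerator:
    """Hypothetical stand-in for the generator class, not the original code."""

    def __init__(self, seed=(1, 2, 3)):
        self._seed = seed
        self._lock = threading.Lock()

    def random(self):
        # Hold the lock across the whole read-modify-write of self._seed,
        # so no other thread can read the seed before it has been updated.
        with self._lock:
            x, y, z = self._seed
            x = (171 * x) % 30269
            y = (172 * y) % 30307
            z = (170 * z) % 30323
            self._seed = (x, y, z)
        return (x / 30269.0 + y / 30307.0 + z / 30323.0) % 1.0


def draw_from_threads(gen, n_threads=4, per_thread=500):
    """Draw numbers from one shared generator using several threads."""
    results = []

    def worker():
        for _ in range(per_thread):
            results.append(gen.random())

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With the lock in place, every seed state is consumed exactly once, so the threads together should produce the same set of numbers that a single thread would, just in an unpredictable order.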
dfriedman at Ask.Metafilter.Com
Other answers
It's only unsafe if you have a single instance of whrandom that's being accessed from multiple threads concurrently. If two threads call the random method at the same time, they can interfere with each other. For example, suppose thread A reads self._seed at line 69, and before it does anything else, thread B executes the same line. Both threads will get the same x, y, z values, so you'll get the same random number returned from both threads. Furthermore, a thread could end up being delayed for any number of reasons, such as a page fault. If 1000 random numbers were generated in the meantime, then when the delayed thread finally stores its stale seed back into the instance variable, that whole sequence will be repeated. (As bad as this is, you're protected from the worst thread-safety problems because the standard Python VM only executes a single bytecode operation at a time. Although you can't predict the order in which bytecodes from different threads are interleaved, at least there is guaranteed to be some consistent ordering. When threads can run simultaneously on multiple cores, as in most other languages, even that isn't a safe assumption.)
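The interleaving described here can be replayed by hand, without real threads, which makes it repeatable. This is only a sketch: UnsafeGenerator and simulate_interleaving are hypothetical names, and the Wichmann-Hill style arithmetic is my assumption about what the linked code does, not a copy of it.

```python
class UnsafeGenerator:
    """Hypothetical generator with the same read-then-write-back shape."""

    def __init__(self, seed=(1, 2, 3)):
        self._seed = seed

    @staticmethod
    def advance(seed):
        # Pure computation: returns the next seed and the number it yields.
        x, y, z = seed
        x = (171 * x) % 30269
        y = (172 * y) % 30307
        z = (170 * z) % 30323
        value = (x / 30269.0 + y / 30307.0 + z / 30323.0) % 1.0
        return (x, y, z), value

    def random(self):
        new_seed, value = self.advance(self._seed)  # read and compute
        self._seed = new_seed                       # write back
        return value


def simulate_interleaving():
    gen = UnsafeGenerator()

    # "Thread A" reads the seed and is then delayed.
    stale_seed = gen._seed

    # "Thread B" generates 1000 numbers in the meantime.
    b_values = [gen.random() for _ in range(1000)]

    # "Thread A" resumes, computes from its stale copy and writes it back.
    new_seed, a_value = UnsafeGenerator.advance(stale_seed)
    gen._seed = new_seed

    # The next call now replays the sequence thread B already produced.
    replayed = gen.random()
    return a_value, b_values, replayed


if __name__ == "__main__":
    a_value, b_values, replayed = simulate_interleaving()
    print(a_value == b_values[0])   # A got the same number as B's first call
    print(replayed == b_values[1])  # the sequence has been rewound
```

Both comparisons come out true: thread A returns the number thread B already returned, and the write-back rewinds the generator to the start of B's sequence.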
teraflop
Because it reads and modifies a shared instance variable, seed. If two threads call that function at once, they'll step on each other's toes (and probably return the same random number). It should be safe, however, to create one instance per thread and only use it in that thread context.
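A minimal sketch of the one-instance-per-thread idea, in Python 3. Since whrandom isn't available in current Python versions, this uses the standard library's random.Random as a stand-in; the function names worker and run_per_thread are made up for the example.

```python
import random
import threading


def worker(seed, n, results, index):
    # Each thread builds its own generator; no other thread ever touches it.
    gen = random.Random(seed)
    results[index] = [gen.random() for _ in range(n)]


def run_per_thread(seeds, n=100):
    """Start one thread per seed, each with a private generator."""
    results = [None] * len(seeds)
    threads = [
        threading.Thread(target=worker, args=(seed, n, results, i))
        for i, seed in enumerate(seeds)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because no generator state is shared, each thread's sequence depends only on its own seed, however the threads happen to be scheduled. Each thread does need a different seed, otherwise they will all produce the same sequence.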
sbutler
It's not thread-safe because two or more threads would share, and be able to write to, self._seed. If thread A and thread B both called random at the same time, they would get the same seed (instead of the next seed in the sequence), thereby generating the same "random" number. It might be easier to understand with a simpler example:

    # begin program
    from threading import Thread
    import time

    class Counter(object):
        def __init__(self):
            self.count = 0

        def increment_count(self):
            current_count = self.count
            # do something else for a while
            for i in range(1000):
                pass
            self.count = current_count + 1
            print(self.count)

    # calls increment_count n times on object
    def call_increment_count_n_times(obj, n):
        for i in range(n):
            obj.increment_count()

    if __name__ == '__main__':
        print("no threading")
        c = Counter()
        for i in range(10):
            c.increment_count()
        # make two threads, each calls increment_count five times
        print("")
        print("with two threads")
        c = Counter()
        t1 = Thread(target=call_increment_count_n_times, args=(c, 5))
        t2 = Thread(target=call_increment_count_n_times, args=(c, 5))
        t1.start()
        t2.start()
    # end program

Run this program at the command line and you'll get output like this:

    no threading
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10

    with two threads
    1
    2
    3
    3
    4
    5
    4
    5
    6
    7

(I've cleaned up the threaded output a little bit, since stdout gets garbled when two threads are writing to it simultaneously. You might get different numbers.) As you can see, the threaded example never makes it to ten. The issue here is that thread t1 copies the value of self.count to current_count, does something else for a while, then writes back to self.count with the value of its copy plus one. Meanwhile, t2 is doing the same thing. If t1 incremented the value of self.count while t2 was in its for loop, t2 doesn't know about it, and happily sets the value of self.count to what it thinks is the appropriate value, overwriting whatever t1 might have set it to.
Once you've worked with multithreaded programming for a while, you learn to spot stuff like this a mile away. The most basic rule of thumb is that two threads should never share write access to a variable. You can use locks to guarantee that only one thread has access to a variable at once, but it's often a better idea to use a Queue (http://docs.python.org/library/queue.html) to communicate between threads instead.
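As a rough illustration of the queue approach, here is the counter idea again in Python 3, where the module is called queue (it is Queue in Python 2). The function name run_counter and the None sentinel are choices made for this sketch, not anything from the original program. Only one thread ever writes to the count; the others just send it messages.

```python
import queue
import threading


def run_counter(n_threads=2, increments_per_thread=5):
    requests = queue.Queue()
    result = []

    def consumer():
        # The only thread that owns and modifies the count.
        count = 0
        while True:
            item = requests.get()
            if item is None:  # sentinel: no more work
                break
            count += item
        result.append(count)

    def producer():
        # Producers never touch the count; they only enqueue requests.
        for _ in range(increments_per_thread):
            requests.put(1)

    consumer_thread = threading.Thread(target=consumer)
    consumer_thread.start()

    producers = [threading.Thread(target=producer) for _ in range(n_threads)]
    for t in producers:
        t.start()
    for t in producers:
        t.join()

    requests.put(None)  # all producers are done, tell the consumer to stop
    consumer_thread.join()
    return result[0]
```

Called with the defaults (two threads, five increments each), this should always come back with 10, because no increment can be overwritten by another thread.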
aparrish
The author states it pretty clearly in the comment, but the reason is that it uses the current time as part of the seed for the random value (see t in the seed() method), making it possible that two calls to seed() will use the same values and yield the same results (thereby breaking randomness). Or as the comment puts it, "the random number generator used here is not thread-safe; it is possible that nearly simultaneous calls in different theads return the same random value." "Nearly simultaneous" here means that the value of time.time() has finite precision: a difference of 1 in the last decimal position is the interval of time required before the x, y, z initialization values can yield new pseudorandom results.
Matt Oneiros
The above is correct. Added for emphasis: If you aren't using multiple threads, the thread safety warning doesn't apply. If you are using multiple threads, but are only using a given instance of whrandom within a single thread, the thread safety warning doesn't apply. However, if you are using multiple threads, and it is possible for more than one thread to ask a whrandom instance for a random number, then you need to wrap access to whrandom.random() so that it is protected by a lock. There's an example of how to do that at http://effbot.org/zone/thread-synchronization.htm.
dws
This is all very helpful, thanks!
dfriedman