It's a question of whether you want to use one big, fat lock (aka the GIL) or lots of fine-grained locks. The former is easier to implement than the latter. If you want to know more about CPython's GIL, you should check out David Beazley's GIL talks ([1] & [2]). I think video recordings of his talks are available on YouTube as well.
Btw, in CPython extensions you're allowed to release the GIL, do a bunch of computation, and let other Python threads do their job. When you're done with the computation, you can reacquire the GIL and proceed as normal. This is what CPython does whenever you make a blocking syscall. Extensions like NumPy release the GIL and do their computation with multiple threads. But the GIL becomes a huge issue if you're trying to do a lot of CPU-intensive work in pure Python code.
[1] - www.dabeaz.com/python/UnderstandingGIL.pdf
[2] - www.dabeaz.com/python/GIL.pdf
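The effect described above is easy to see for yourself. A rough sketch comparing the same pure-Python CPU work done sequentially versus split across two threads (exact timings will vary by machine and Python version, but on a GIL-based CPython the threaded run is not faster):

```python
import threading
import time

def count(n):
    # pure-Python CPU work; under the GIL these loops cannot
    # run in parallel across threads
    while n > 0:
        n -= 1

N = 5_000_000

start = time.perf_counter()
count(N)
count(N)
sequential = time.perf_counter() - start

start = time.perf_counter()
t1 = threading.Thread(target=count, args=(N,))
t2 = threading.Thread(target=count, args=(N,))
t1.start(); t2.start()
t1.join(); t2.join()
threaded = time.perf_counter() - start

# With a GIL the threaded version is typically no faster, and is
# often slower due to lock handoff overhead between the threads.
print(f"sequential: {sequential:.2f}s  threaded: {threaded:.2f}s")
```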
Java was designed from the beginning (more or less) to be multithreaded. A global lock makes truly parallel multithreading impossible; it means one thread can hog the interpreter and starve the others of resources. Even system threads could potentially be starved.
GILs are simple to implement, and effective, but limited. The Java designers had much higher hopes for their threading model.

The Global Interpreter Lock (GIL) is a mechanism used in some programming language interpreters, notably CPython (the reference implementation of Python) and MRI (Matz’s Ruby Interpreter), to ensure that only one thread executes Python or Ruby bytecode at a time. This simplifies memory management and eliminates the complexities of concurrent access to objects. However, it also limits the ability to fully utilize multi-core processors for CPU-bound tasks.
Reasons JVM Does Not Have a GIL:
- Architecture and Design Philosophy:
- The JVM was designed with a focus on enabling high-performance, multi-threaded applications. Java's concurrency model is built around threads, and the JVM provides robust support for multi-threading without a GIL.
- The JVM leverages native operating system threads, allowing it to take advantage of multi-core architectures effectively.
- Garbage Collection:
- The JVM employs sophisticated garbage collection techniques that are designed to work in a concurrent environment. The garbage collector can run concurrently with application threads, which reduces the need for a GIL to manage memory safely.
- Different garbage collection algorithms (like G1, ZGC, and Shenandoah) can operate concurrently with application threads, allowing for better performance in multi-threaded environments.
- Synchronization Mechanisms:
- Java provides a rich set of synchronization primitives (like synchronized blocks, volatile variables, and higher-level constructs from the java.util.concurrent package) that allow developers to manage concurrency without a GIL.
- These primitives are designed to work seamlessly with the underlying thread model, enabling fine-grained control over synchronization and resource access.
- Language Features:
- Java was designed from the ground up with multi-threading in mind, allowing developers to create highly concurrent applications. In contrast, Python and Ruby were originally designed for simpler, single-threaded applications and later adapted to support concurrency, leading to the introduction of the GIL as a way to simplify that adaptation.
- Performance Considerations:
- The absence of a GIL in the JVM allows for better performance in CPU-bound applications, as it can utilize multiple cores effectively. This is particularly beneficial for applications that require high throughput and low latency.
Conclusion:
In summary, the JVM does not have a GIL because it was designed to support multi-threading from the beginning, employs efficient garbage collection strategies, and provides robust synchronization mechanisms that allow for concurrent execution without the need for a GIL. In contrast, Python and Ruby's GIL arose from their need to simplify memory management and object access in a multi-threaded environment, which can lead to performance bottlenecks in multi-core scenarios.
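The fine-grained model the JVM uses can be sketched in Python itself: instead of one interpreter-wide lock, each shared object carries its own lock, roughly analogous to a Java synchronized block. This is only an illustration of the idea, not how CPython actually works internally:

```python
import threading

class SynchronizedCounter:
    """One lock per object -- the fine-grained locking model.
    Only this object is locked during an update; other objects
    can be mutated by other threads at the same time."""
    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        # critical section guarded by this object's own lock
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value

counter = SynchronizedCounter()

def hammer():
    for _ in range(10_000):
        counter.increment()

threads = [threading.Thread(target=hammer) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 40000 -- no lost updates
```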
It's a design decision. Either give one global lock to the running thread, or lock each resource a thread is using. The JVM uses the latter approach. The JVM implementation of Python (Jython) does not introduce a GIL. I'm not sure if the same is true for JRuby.
It's a very tough decision to make. Since threads are handled differently per OS, the implementors have to prioritize between the two: a cross-platform language or a cross-platform execution environment. It's already a tremendous effort to implement a language, much more for an execution environment.
Python (the language) doesn't need a GIL (which is why it can perfectly be implemented on JVM [Jython] and .NET [IronPython], and those implementations multithread freely). CPython (the popular implementation) has always used a GIL for ease of coding (esp. the coding of the garbage collection mechanisms) and of integration of non-thread-safe C-coded libraries (there used to be a ton of those around;-).
The Unladen Swallow project, among other ambitious goals, does plan a GIL-free virtual machine for Python -- to quote that site, "In addition, we intend to remove the GIL and fix the state of multithreading in Python. We believe this is possible through the implementation of a more sophisticated GC system, something like IBM's Recycler (Bacon et al, 2001)."
A2A
The other answers here are good, but they focus on the excuses for the existence of a GIL and not so much on the why.
TL;DR
Threads are hard and error-prone, and true threading is not always a win; simple mistakes can have a devastating effect on performance, up to and including a deadlock that completely halts the program. The GIL avoids nearly all of this.
Why can only one thread own the interpretation routine?
restated
Why can only one thread run at a time?
When multiple threads run concurrently, they can access the same data at the same time. Consider a Python list, which is actually backed by an underlying array of data. If two threads were to append to the same list at the same time, which thread wins? There is only one last bucket, so the thread that arrived a picosecond behind the other would likely win the append, and the first thread's data would be lost. Even worse, suppose the append triggers a dynamic resize of the array: allocate a new array that is larger than the previous one, copy the data from the original, and replace the original with the new copy. If two threads execute that routine at the same time, what is the result going to be? Two new arrays allocated, two copies made, and then finally the replacement, where only one of the threads wins, again with data loss. This is only one of many situations where something that seems simple becomes very complex with threading.
Simply put, two threads cannot run at one time because when they both write to the same data structures, those structures will become corrupt and data will be lost.
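This is exactly what the GIL prevents in CPython: a bare list.append is effectively atomic, because each append executes while holding the interpreter lock. A small experiment (CPython-specific behavior) shows that many threads can append to one shared list without losing a single element:

```python
import threading

shared = []

def appender(thread_id, n):
    # In CPython, list.append runs under the GIL, so concurrent
    # appends neither corrupt the list nor lose elements.
    for i in range(n):
        shared.append((thread_id, i))

N = 10_000
threads = [threading.Thread(target=appender, args=(tid, N)) for tid in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(shared))  # 40000 -- every append survived
```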
But Java and other languages allow it!
Very true, but if you look closely at Java and the other languages that allow true threading, you will find that those languages have different versions of the same type of Collection, one that is “Thread Safe” and another that is not. A snip from the ArrayList JavaDoc.
Note that this implementation is not synchronized. If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more elements, or explicitly resizes the backing array; merely setting the value of an element is not a structural modification.)
ArrayList (Java Platform SE 8)
In order for Python to provide a non-GIL capability, it would then need to provide a "thread-safe" version of List, Dict, Set, and any other collection that could be used. If the existing ones were simply made thread safe, every application that uses those collections would slow down, because the work to make a collection thread safe is not free. If new data structures were created instead, that might work, but then developers would need to know when to use which. Perhaps that isn't too hard a requirement, but then consider all of the existing Python libraries you are now using which assume a GIL. They could no longer be used, because they would not be thread safe.
Thread Safe?
What does it mean to be thread safe? It means that the code is aware that multiple threads of execution are running concurrently, and that any action which might create a problem in that scenario is synchronized. To be synchronized means that only one thread at a time can execute a critical block of code: code that, if two threads executed it simultaneously, would corrupt data or otherwise cause harm. Each thread must wait for the previous thread to complete before it can run. In the case of the list above, the first thread to request the append would obtain a lock, resize the list, append its data, and release the lock. The second thread, given the lock, would see that the resize is no longer required and would only append its data, after the data of thread 1. However, just as with the GIL, thread 2 had to stop execution and wait.
Deadlocks anyone?
That is all fine and good, and could even be invisible to the developer, albeit with an inherent slowdown on every access to the data. But what happens if you have two threads and two different collections, and both threads need to modify both collections, but not in the same block of code? Thread 1 first modifies collection A and then collection B; in thread 2 the order is reversed, with thread 2 modifying collection B first. In the simple scenario, each thread makes its modification to the first collection; then each thread modifies the second collection without knowing the other thread has already changed it. While data is not lost, thanks to our fancy synchronized collections, data may be incorrect in both cases, because each thread assumed the data was consistent across both collections throughout its operation. In the more complicated scenario, you try to fix this by having each thread lock both collections before releasing either, ensuring that modifications are consistent. But when thread 1 locks collection A and thread 2 locks collection B, and thread 1 now needs collection B to proceed while thread 2 needs collection A, you have a classic deadlock, and both threads, along with any other threads that need those same resources, will never be able to continue. This is a simple illustration; the situations that happen in the real world can be much more complex and subtle.
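The classic fix for the deadlock described above is a global lock ordering: every thread acquires the locks in the same order, so the circular wait can never form. A minimal sketch, with two locks standing in for the two collections:

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
results = []

def worker(name):
    # Both threads take lock_a BEFORE lock_b. If one thread took
    # them in the opposite order, each could end up holding one
    # lock while waiting forever for the other -- a deadlock.
    with lock_a:
        with lock_b:
            results.append(name)

threads = [threading.Thread(target=worker, args=(n,)) for n in ("t1", "t2")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # ['t1', 't2'] -- both finished, no deadlock
```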
Why do people use threads in Python?
Often the need for threads is created by an external resource that, if waited on, would slow a single-threaded application to a halt. Asynchronous IO is a potential way to stay in a single thread in such cases, but asynchronous IO can be devastatingly complex, and even when implemented correctly it makes code very difficult to read. Threads, on the other hand, let you write simple code without blocking the entire process on IO from the external resource. In this case the GIL imposes no penalty: the thread yields to the other available threads in the program while it waits for IO, and when the IO completes the thread is reawakened and continues processing as though it never stopped. This is the case for network calls, disk reads, database calls, and external API calls. Often IO is the vast majority of the time spent in a program, and the GIL makes it all easy without imposing an undue performance penalty.
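The IO case is easy to demonstrate. Because a waiting thread releases the GIL, several simulated IO waits overlap instead of adding up (time.sleep releases the GIL just like a real blocking syscall, so it makes a convenient stand-in for a network or disk call):

```python
import threading
import time

def fake_io_call():
    # sleep releases the GIL, so the other threads
    # run while this one is blocked "on IO"
    time.sleep(0.2)

start = time.perf_counter()
threads = [threading.Thread(target=fake_io_call) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# Five 0.2s waits overlap: total wall time is ~0.2s, not ~1.0s
print(f"{elapsed:.2f}s")
```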
But my workload is CPU bound with little IO!
In that case, the GIL will be an issue for using threads. However, it is not an issue for using Python. Python provides an alternative multiprocessing module that uses a very similar interface to the threading interface and lets you easily add enough processes to maximize CPU utilization on any server. Each process gets its own GIL, so there is no issue with concurrency.
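A minimal sketch of that approach: spreading CPU-bound work across a process pool, where each worker process has its own interpreter and its own GIL and so runs truly in parallel. (The guard around the entry point matters on platforms that spawn rather than fork worker processes.)

```python
from multiprocessing import Pool

def cpu_work(n):
    # runs in a separate process with its own GIL, so several
    # of these calls can execute in parallel on separate cores
    total = 0
    for i in range(n):
        total += i * i
    return total

def run_parallel():
    # four worker processes, each handed one chunk of work
    with Pool(processes=4) as pool:
        return pool.map(cpu_work, [100_000] * 4)

if __name__ == "__main__":
    print(run_parallel())
```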
To be fair, there are GIL-less Pythons (and probably Rubies).
But, have you seen what .NET looks like on the inside, where everything is made to be fast and GIL-free? It’s a nightmare, it could only have been written by people who were getting paid for it. The GIL makes the job of the runtime, compiler and interpreter writers much easier.
P.S. Also keep in mind CPython is a reference implementation - yes, it’s a working program, but it’s also supposed to be documentation. You can’t make it efficient without also making it ugly and unsuitable to be used as documentation.
I don't know how many people use Jython ... but I'm pretty sure it's a tiny fraction of those who are using CPython and it's probably a pretty small fraction of those who are using Python with NumPy.
(NumPy/SciPy drives a significant portion of Python usage ... and Jython doesn't support it nor the rest of its ecosystem).
But, why do people use NumPy and Numba and the rest of that ecosystem if CPython is crippled by the GIL?
Oh. It isn't! The performance isn't crippled by the GIL. Portions of NumPy are compiled C, written to release the GIL at a fine-grained level ... and also to use the SIMD features of those processors which have them.
How about other areas of CPython usage? How often does multi-threading performance really make a difference?
Systems administration tasks (such as those using the Paramiko (ssh) module, including Fabric and Ansible, and various others) are generally dominated by start-up times for new processes and network connection latencies. Threading isn't usually an issue for those ... and multiprocessing or distributed processes (various job queues using Redis and others) are generally preferable to multithreading in those applications.
I find that most questions about the Python GIL display ignorance and a bit of laziness about the issues around threading, processing, and distributed processing. People hear just enough about the GIL to decide that it must be some sort of insurmountable issue which should discourage the use of Python (or CPython) across the board.
The fact is that there are only a narrow range of applications in which multi-threading is potentially preferable to other approaches ... and fewer where it's optimal.
On the other hand there is a huge range of applications for a programming language which is generally easy to learn, relatively easy to read, includes a broad and useful range of standard libraries (batteries are included) and has huge collections of freely available modules, frameworks, libraries and tools.
Jython is Python. But CPython is the tool with that huge corpus of available software (only some of which is portable to Jython). The CPython ecosystem is far more extensive than Jython's.
Also, the start-up time for Jython is practically crippling for most of the simple administrative and utility use-cases at which Python excels. On my MacBook I can run time python -c 'import sys; sys.exit' in under three hundredths of a second (0.029 seconds). Jython generally takes closer to 2 seconds (between 1.81 and 1.99 seconds) to run the same code. That might not seem fair, but administrative and other command-line utilities are a use case where start-up time is important, and one-second load times would be completely disruptive.
The CPython interpreter uses a GIL because, perhaps counter-intuitively, it offered the best performance and the simplest design for the time when it was written.
When the CPython interpreter was originally written, the most common threading model was a single CPU-intensive main producer thread with multiple IO-busy consumer threads. Machines were typically single-core, so having a GIL in this situation is fine, especially since the interpreter itself means there is a lot of shared data between threads, all of which is reference counted. Remember, everything in Python is a reference-counted object, including integers and floats.
Having a discrete lock on every object would be lock-intensive and slower: a re-implementation of CPython with discrete locks per object was around 5% slower.
To use discrete locks (as the JVM does), a few things need to happen:
- Massively reduce the amount of shared reference counted objects - some recent changes and proposed changes move in that direction - sub-interpreters for example.
- A rework of the CPython core to use discrete locks.
- A rework of all Python libraries that manipulate the GIL.
These are all big projects and until item 3 is complete the GIL can’t disappear.
Yes, Java can indeed be used as an implementation language for creating a Python-like programming language with its own compiler or interpreter.
Here are the key points on how this can be accomplished:
1. Lexical Analysis:
This is the first stage where the source code is divided into tokens. Java provides libraries such as ANTLR (Another Tool for Language Recognition) which can help in writing lexical analyzers.
2. Syntax Analysis (Parsing):
After tokenization, the next step is parsing, where the tokens are transformed into a syntax tree. Again, tools like ANTLR or JavaCC (Java Compiler Compiler) can be used to generate parsers.
3. Semantic Analysis:
This involves checking the syntax tree for semantic errors. This can be custom-implemented in Java by traversing the syntax tree and applying rules specific to your Python-like language.
4. Intermediate Code Generation:
In this step, the syntax tree is transformed into an intermediate representation (IR). This IR can then be optimized before generating the final output. Java can be used to create and manipulate this IR.
5. Optimization:
Various optimizations can be applied to the intermediate code to improve performance. Java’s rich set of libraries and efficient algorithms can be used to implement these optimizations.
6. Code Generation:
The intermediate code is then converted into target code. This can be bytecode (if you're building a virtual machine) or machine code (if you're building a compiler). Java can generate Java bytecode dynamically using libraries like ASM or BCEL (Byte Code Engineering Library).
7. Interpreter (if applicable):
If you decide to create an interpreter instead of (or in addition to) a compiler, Java can be used to write the interpreter that executes the IR or the source code directly.
8. Runtime Environment:
If your language requires a runtime environment, Java can be used to create this as well. This includes memory management, garbage collection, etc., leveraging Java’s own runtime capabilities.
Example Tools and Libraries:
- ANTLR:
A powerful parser generator for reading, processing, executing, or translating structured text or binary files. It's widely used for language development.
- JavaCC:
Another parser generator for Java that can help in building compilers and interpreters.
- ASM:
A Java library for generating and manipulating bytecode, useful if your target is Java bytecode.
- BCEL:
Another library for bytecode manipulation.
Steps in Detail:
1. Design the Language Grammar:
- Define the syntax and semantics of your Python-like language.
- Use ANTLR or JavaCC to create a grammar file that defines the language structure.
2. Implement the Lexer and Parser:
- Use the grammar file with ANTLR/JavaCC to generate the lexer and parser.
- Integrate the generated lexer and parser with your Java application.
3. Build the Abstract Syntax Tree (AST):
- Create classes in Java representing the nodes of your AST.
- Use the parser to build an AST from the source code.
4. Semantic Analysis and Transformation:
- Implement visitor or traversal patterns to analyze and transform the AST.
- Ensure type checking, scope resolution, and other semantic checks.
5. Generate Intermediate Representation (IR):
- Define an intermediate representation for your language.
- Implement code generation to transform the AST into IR.
6. Optimize the IR:
- Apply optimization techniques to improve the IR.
- Implement common optimizations like dead code elimination, constant folding, etc.
7. Generate Target Code:
- Convert the optimized IR into target code (e.g., Java bytecode).
- Use libraries like ASM to generate the bytecode if targeting the JVM.
8. Implement the Runtime:
- Develop runtime support for your language (e.g., memory management, standard libraries).
- Leverage Java's runtime features for efficient implementation.
By following these steps, you can create a Python-like language using Java as the implementation language. The choice of Java provides strong type safety, a rich set of libraries, and the ability to generate and manipulate bytecode, making it a suitable choice for building compilers and interpreters.
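The front half of that pipeline (lexing, parsing, evaluation) fits in a small sketch. This toy is written in Python rather than Java purely for brevity, and handles only integer addition, multiplication, and parentheses; the same structure translates directly to an ANTLR- or JavaCC-generated front end:

```python
import re

# lexical analysis: split the source into number and operator tokens
TOKEN = re.compile(r"\s*(?:(\d+)|(.))")

def tokenize(src):
    tokens = []
    for number, op in TOKEN.findall(src):
        tokens.append(int(number) if number else op)
    return tokens

class Parser:
    """Recursive-descent parser and evaluator for the grammar:
    expr   := term ('+' term)*
    term   := factor ('*' factor)*
    factor := NUMBER | '(' expr ')'"""
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        value = self.term()
        while self.peek() == "+":
            self.advance()
            value += self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() == "*":
            self.advance()
            value *= self.factor()
        return value

    def factor(self):
        tok = self.advance()
        if tok == "(":
            value = self.expr()
            assert self.advance() == ")", "expected ')'"
            return value
        return tok  # a number token

def evaluate(src):
    return Parser(tokenize(src)).expr()

print(evaluate("1 + 2 * (3 + 4)"))  # 15
```

A full implementation would build an explicit AST between the parser and the evaluator (steps 3-5 above) rather than evaluating during the parse, but the tokenizer/parser split is the same.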
There’s massive inertia in the system.
And the more software we have, and the more legacy systems we have to support and fix and extend, the slower it evolves.
It’s no different from in the 90s when everyone was asking why COBOL was still around when we had much better languages.
Well, COBOL still is around. (About five years ago, a friend of mine told me that the bank she worked at tried to replace an old COBOL system with C++ and couldn’t because C++ was too slow. Think about that for a second.)
So … finally … in the late 90s, there was a big shift. And Java caught that wave. Java was the winner at a time of huge expansion, and many enterprises deciding to finally bite the bullet and upgrade their 20 or 30 year old architectures.
It is massively locked in.
Now, there are places where that doesn't matter: areas which are new and fast-moving, where legacy issues don't matter much. For example, in robotics, in mobile, and in web services, Java is losing ground to other languages.
Google just made Kotlin an official language for programming Android. I’d be interested to know why Kotlin and not Scala or Clojure. My hunch is that both those languages are seen as bigger and heavier. And perhaps Clojure is still too exotic and difficult for people. (Despite being the nicest language on offer today)
I expect slow encroachment by Clojure (and Kotlin and Scala etc.) where more and more newbuild for the JVM is done using them. But there won’t be the big wave of adoption we saw with Java.
Partly, also, because Java was heavily pushed by a corporate sponsor, Sun, whereas Clojure and Scala etc. are coming from the small companies and communities. I’d guess that Datomic don’t have salesmen going into enterprises promising that Clojure will solve all their problems, the way Sun used to promote Java.
Erlang was designed to be concurrent and distributed from the get-go: the assumption, before the language existed, was that it would permit multiple, independent, processes running on collections of devices. So, the architecture is completely different and much more amenable to concurrent programming.
I use Python all the time, and I like it, but it’s just not good in this department. I’m not a Ruby user, but I have a friend who uses it all the time, and I’m confident he would say the same thing: it’s great when you need to whip out a useful, sequential program, and it has fantastic libraries for all kinds of things. But in Erlang (and Elixir, of course, because it also runs on the Beam), concurrent, distributed things are just much easier, because they are supported top to bottom.
Python and Ruby were designed as normal, sequential, imperative languages. Programs in these languages are thought of as conventional, stateful programs, and the interpreter itself is a conventional, stateful program that carries both the interpreter state and the program state. The two are intertwined. There is a global environment that you import things into and can extend at will, for example. When concurrent threads were bolted on, you not only had the shared state of the program threads; all the threads were running in an interpreter with lots of global state, and they were modifying their own memory state and the interpreter state, too.
The GIL is a rather crude solution to this problem: one giant lock that protects all the interpreter’s state. Since so many things modify it, effectively only one thread can actually be interpreted at a time.
A giant lock was not a new idea. Some operating systems did the same thing (see Giant lock - Wikipedia) as they moved onto SMP systems. I recall getting a big server machine back in the 1990s that had a global file system lock that mount requests had to acquire. This made loop-back mounting problematic (yes, we deadlocked the file system while others were using the server). Linux had a big kernel lock as it migrated to SMP systems: user programs were concurrent, but kernel code was not. The big kernel lock was finally eliminated in 2011. (Stay tuned for an echo of this story.)
The Beam, which is the modern Erlang virtual machine, was designed not to interpret a sequential language in a sequential environment. Rather it was designed to be something that implemented pre-emptive, shared nothing, concurrent processes in user space. In many ways, the Erlang runtime system (ERTS) is a mini-operating system that runs in support of an interpreter. Since the language is mostly functional and since processes don’t share anything, there is a lot less contention for shared resources. All that happens in the runtime system. Again, the interpreter was specifically designed to be run to interpret multiple concurrent processes, so they did what any good designer of a concurrent system does: eliminate as much shared state as possible and protect what you can’t eliminate, but they also designed the language semantics to make that easier. For example, there isn’t a global environment that all processes share and can add new variables to whenever they like. If it doesn’t exist, you don’t have to protect it! Each process has its own stack and heap, so there is no global garbage collection. Let me say that again: every Erlang process can run, allocate memory, and run the GC without coordinating with other Erlang processes or with the VM. Again, separate, non-shared things don’t require synchronization.
In fact, ERTS took some time to support SMP. Erlang programs ran concurrently without multi-core support until 2006. Here’s something really cool: when the runtime system started to support multicore computing, applications didn’t have to change. They just ran faster. That’s because of the insulation of the runtime system from the programs it runs — it’s more like an OS. Note the parallel here with Linux: first, a single scheduler could run multiple processes on multicore machines, though the ERTS was not itself distributed across cores; then the ERTS was eventually distributed across cores and user programs just reaped the benefits. This is what Linux would do some years later.
Today, the ERTS runs an Erlang VM scheduler per core and migrates Erlang processes from core to core based on load.
On a final note, the fact that distribution across machines was there from the start affected design decisions, too. Joe Armstrong once said, when asked why they didn’t have shared memory, something like, “in a system that runs on thousands of telephone switches all over Europe, where is the shared memory?” The idea that you can write a program that starts and monitors processes on other computers leads you to make pretty clear separations between processes and to be very clear about how they can communicate. If a process is to run without worrying about whether another process it talks to is on the same computer or one on another floor or in another city, then communication via messages is a natural choice.
If you’re interested in the gory details, look at the The Beam Book: The Erlang Runtime System. I haven’t read the whole thing (yet), but the few times I’ve looked at it, I’ve always learned something interesting.
I only know a bit of Python, but I assume Ruby is similar in this regard.
Both reference counting as a memory-management method and the Global Interpreter Lock were design choices made primarily to keep the implementation of CPython simple. Like Linux, Python and Ruby started as private initiatives by a single person.
Today, when scaling up typically means using many small cores in the cloud rather than a single, multi-core CPU, threading performance has become less of an issue, and initiatives like asyncio and the async/await syntax become more interesting.
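As a minimal, hedged sketch (the function names here are illustrative, not from any particular library), this is roughly what the async/await style looks like: many waits overlap on a single thread, so the GIL never becomes a point of contention.

```python
# Minimal async/await sketch: several simulated I/O waits overlap on one
# thread, so there is no GIL contention at all.
import asyncio

async def fetch(i: int) -> int:
    await asyncio.sleep(0.1)  # stand-in for a network or disk wait
    return i * i

async def main() -> list:
    # gather() runs all coroutines concurrently and preserves argument order
    return await asyncio.gather(*(fetch(i) for i in range(4)))

squares = asyncio.run(main())
print(squares)  # -> [0, 1, 4, 9]
```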
The GIL is per-interpreter, so it doesn't limit scalability that much - you can just spawn multiple processes if you want to use multiple cores. You can then use message passing to communicate between the processes - there is an overhead because you're no longer in a shared address space, but there are some benefits like increased reliability (independent copies of data; fits FP "view of the world", languages like Erlang love it) and flexibility (ZeroMQ can work over a network, probably with no changes to code, except if you need to make your program accommodate the higher latencies of some links).
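A minimal sketch of that process-plus-message-passing style, using only the standard library (the worker name is my own, and the fork start method used here is POSIX-only):

```python
# Each process has its own interpreter and GIL; data moves by message
# passing (pickled copies), not shared memory.
import multiprocessing as mp

def square_worker(inbox, outbox):
    """Receive numbers, send back squares; a None sentinel stops the loop."""
    while True:
        item = inbox.get()
        if item is None:
            break
        outbox.put(item * item)

ctx = mp.get_context("fork")  # fork avoids re-import issues; POSIX-only
inbox, outbox = ctx.Queue(), ctx.Queue()
worker = ctx.Process(target=square_worker, args=(inbox, outbox))
worker.start()
for n in range(5):
    inbox.put(n)      # each message is copied into the child process
inbox.put(None)       # sentinel: tell the worker to exit
results = sorted(outbox.get() for _ in range(5))
worker.join()
print(results)  # -> [0, 1, 4, 9, 16]
```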
It should also be noted that many things happen in parallel despite the GIL, like I/O and code that relies on some libraries (for example NumPy, a popular Python library, does all the number crunching outside the GIL and Python - your Python program will be utilizing the CPU very well, with just small constant overheads for getting data "across the border").
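A small sketch of the I/O side of that claim, using time.sleep as a stand-in for a blocking call that releases the GIL:

```python
# Blocking calls release the GIL, so I/O-bound threads genuinely overlap.
import threading
import time

def blocking_io():
    time.sleep(0.2)  # stand-in for a blocking syscall (network, disk, ...)

start = time.perf_counter()
threads = [threading.Thread(target=blocking_io) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
# Four 0.2 s waits complete in roughly 0.2 s total, not 0.8 s,
# because the GIL is released during each sleep.
print(f"elapsed: {elapsed:.2f}s")
```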
About why they don't remove it, like the other answers mentioned, there are implementations without a GIL - but CPython and MRI are reference implementations, and a GIL makes the interpreter and standard library much simpler. Here is an example from Nobody understands the GIL that shows the same program in Ruby, JRuby and Rubinius:
array = []
5.times.map do
  Thread.new do
    1000.times do
      array << nil
    end
  end
end.each(&:join)
puts array.size

$ ruby pushing_nil.rb
5000
$ jruby pushing_nil.rb
4446
$ rbx pushing_nil.rb
3088
Note the sizes reported by JRuby and Rubinius are random - you'll get a different number on every execution. The number reported by MRI will be correct every time.
The same example will probably work fine in languages that never had a GIL - like C++, Java, C#, etc., but their implementations make sure that elementary operations are atomic - this adds tons of code everywhere, from reference counting, to maintaining the size of dynamic arrays, to even utility functions like date-time operations.
Now, an unsafe interpreter isn't so tragic - you will need some synchronization for your own code anyway, so you could easily make it so only one thread adds to that array at a time. If your application happens to be bottlenecked on the GIL, and you're using a reference implementation, it might be worth it to try JRuby or IronPython.
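The Ruby example above, rewritten in Python with an explicit lock (a sketch; with the lock, the count is correct on any implementation, GIL or not):

```python
import threading

array = []
lock = threading.Lock()

def push_nils():
    for _ in range(1000):
        with lock:  # explicit synchronization instead of relying on a GIL
            array.append(None)

threads = [threading.Thread(target=push_nils) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(array))  # -> 5000, deterministically
```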
Conclusion: For you, the GIL is bad, except if you're working on the interpreter or using the interpreter as documentation (CPython is surprisingly readable, MRI is probably the same). However, lots of applications don't bottleneck on the GIL, so you can often use the reference implementations - for the rest, there are alternative implementations without GILs.
P.S. Heavy edit thanks to User-11110385825575310654 :)
In theory, yes. The JVM executes Java bytecode, and as long as you can transform the scripting language into proper bytecode (as javac does for Java), the JVM can achieve performance similar to Java's. In practice, however, this is not easy, especially for dynamic languages like Python. The first part of a blog post by Charles Nutter (author of JRuby) (http://blog.headius.com/2008/09/first-taste-of-invokedynamic.html?m=1) describes this problem in detail. Dynamic typing limits the optimizations the JVM can do. In Jython (an implementation of Python on the JVM), every runtime object has to be an instance of PyObject or one of its subclasses. A simple integer addition x + y needs to first check x's and y's types, then unbox x and y from PyObject to int, add them, and finally box the result back into a PyObject. This is definitely slower than Java, which only needs to perform a single int addition. Runtime method binding is another big factor that slows things down - neither reflection nor invokedynamic is good enough performance-wise.
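A hedged sketch in plain Python of the boxing overhead described above; PyIntBox is my illustrative stand-in for Jython's PyObject hierarchy, not its real API:

```python
class PyIntBox:
    """Illustrative boxed integer, standing in for Jython's PyObject."""
    def __init__(self, value):
        self.value = value

def dynamic_add(x, y):
    # 1. check the runtime types
    if not (isinstance(x, PyIntBox) and isinstance(y, PyIntBox)):
        raise TypeError("unsupported operand types")
    # 2. unbox to primitive ints
    a, b = x.value, y.value
    # 3. add the primitives (the only step statically-typed Java needs)
    result = a + b
    # 4. box the result back up
    return PyIntBox(result)

print(dynamic_add(PyIntBox(2), PyIntBox(3)).value)  # -> 5
```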
I worked on another Python implementation on the JVM about two years ago, called Zippy (https://bitbucket.org/ssllab/zippy), which uses Oracle Labs' Truffle infrastructure. During that time, we compared the performance of various Python implementations on selected benchmarks, and Jython was far from Java-like performance. On the benchmarks we used, Jython didn't even outperform CPython. I don't know the current status of Jython, but I think it still needs a lot of work to improve. In my opinion, the JVM is a great platform for implementing high-level languages. It offers GC, multithreading, and other runtime services almost for free. But to reach peak, Java-like performance on the JVM, a lot of work is still needed.
Perhaps they have, outside of mainstream industry.
It's common when learning computing to 'bond with' a language. Somehow, the way it expresses ideas, and the way you think about solutions lines up so well it seems like an extension of your mind.
It becomes your go to language. It has perceived and real benefits over others. We won't mention the weaknesses; we never do.
Industry doesn't care a fig.
- Can I get lots of programmers?
- Are they cheap?
- Is the language robust?
- Does it have a support system of IDEs, test tools, books, tutorials?
- Is everyone else using it?
This creates an inertia around that which is popular and 'good enough'.
It seems to change on a roughly 25-year cycle to me. In 1992, we wrote GUI apps in C, and C++ was 'the future'. The web hadn't been invented yet, so we didn't care about web frameworks.
Maybe Clojure and Scala will win, but currently they don't seem to have that momentum behind them.
But the answer to your reasonable question is that industry trends towards lowest risk, not highest intellectual value.
The Python GIL does support multi-threading - hence the standard threading module.
What the GIL does is ensure that each Python OPCODE is atomic - i.e. two OPCODEs can’t run at the same time.
If you have two threads there is NOTHING to stop the thread scheduler running an OPCode from one thread, and then an opcode from another thread - that is exactly what multi-threading on a single CPU does - each machine code instruction is atomic.
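You can see that interleaving risk directly with the dis module: even a simple increment compiles to several opcodes, each atomic on its own but not atomic together (exact opcode names vary by CPython version):

```python
import dis

def bump(n):
    n += 1
    return n

instructions = list(dis.get_instructions(bump))
for ins in instructions:
    print(ins.opname)
# Typically: load n, load the constant 1, add, store n, then return --
# a thread switch can land between any two of these.
```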
Some opcodes can be long running, which can occasionally make multi-threading feel less reactive than some would like.
I acknowledge that the GIL prevents threads of the same process from executing Python bytecode in parallel across the cores of a multi-core CPU, and this limits the effectiveness of some multi-threading patterns (for instance, using a thread to ‘delegate’ CPU-heavy work). But other patterns do work effectively - for instance, a producer thread feeding multiple I/O-heavy consumer threads; and if you need to ‘delegate’ CPU-heavy work, you can use a multi-processing architecture.
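A sketch of the producer/consumer pattern mentioned above, using the thread-safe queue module (the doubling "work" here stands in for an I/O call):

```python
import queue
import threading

tasks = queue.Queue()
results = queue.Queue()

def producer(count, n_consumers):
    for i in range(count):
        tasks.put(i)
    for _ in range(n_consumers):
        tasks.put(None)  # one poison pill per consumer

def consumer():
    while True:
        item = tasks.get()
        if item is None:
            break
        # pretend this doubling is an I/O call (HTTP fetch, DB query, ...)
        results.put(item * 2)

n_consumers = 3
threads = [threading.Thread(target=producer, args=(10, n_consumers))]
threads += [threading.Thread(target=consumer) for _ in range(n_consumers)]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = sum(results.get() for _ in range(results.qsize()))
print(total)  # -> 90, i.e. 2 * sum(range(10))
```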
However the idea that the CPython GIL prevents multi-threading is a myth.
There are several, including Jython, IronPython, and PyPy-STM (Software Transactional Memory - PyPy documentation).
However, it’s very likely that your question is misguided. The GIL is one of those issues people raise as an objection to Python without any serious foundation in real-world software engineering, and in ignorance of the many ways to work at scale despite the global interpreter locking done by (C)Python in each interpreter process.
If you implement a design which scales across multiple processes with the multiprocessing module, then your Python code can leverage the fine grained locking which has been built into a modern operating system kernel.
If you use appropriate abstractions of your interprocess (and inter-thread) communications, then your design can be implemented to run across multiple nodes and scale across clusters.
These techniques go way beyond the gains to be had by eliminating the GIL.
Also if you’re using native binary modules such as those at the core of NumPy and some other SciPy components then some of those already work around the GIL (can already scale to multiple processors without GIL contention) and some of them are capable of using CUDA and other GPU interfaces and libraries to offload much of their vector processing away from the main CPUs (where the GIL is irrelevant).
So, the question becomes: why do you think you’d see some advantage from a version of Python without the GIL?
Hello,
the only programming language which doesn’t need any work to be executed is… binary (processor-specific) machine code.
And you’ll need to make sure you access the correct memory address where it lives in order to run it.
I won’t even speak about assembler, because it also requires “more work” to be understood by the processor.
I’m really speaking about an unintelligible (or nearly so) sequence of bits, read as ‘1’ (if current flows) and ‘0’ (if not), which is the only thing the processor can deal with…
By the way, you are completely wrong in thinking that the OS can execute Ruby or Python directly…
Both are interpreted languages, just like PHP and others, and require an interpreter to be executed.
If you don’t install the correct interpreter, you have no way to execute them.
You may have the impression that the OS executes such code without any further requirement (when it is saved in a file with the correct extension) thanks to “file association”, which lets the OS know it has to launch the correct interpreter in order to execute the file…
C++ is a compiled language. That means that you’ll need to compile again every file having changed since the last time it has been compiled.
But the final result is a file containing (processor-specific) binary instructions ready to be used by the processor.
Java sits midway between compiled and interpreted languages:
The code you wrote is “compiled” to produce an “intermediate” code which can then be… interpreted by the virtual machine.
Some people say that it gives you the best of both worlds (compilation vs. interpretation)…
I’m not sure it doesn’t give you the worst of them 😉
At the moment I think two main reasons:
- Existing projects expect the GIL to be there, it would break compatibility to remove it.
- Investment. Python doesn’t have a massive corporate sponsor like Java or C#, it’s a lot of work, and someone is going to have to pay for it.
I have extensive experience with Python. The GIL is not as big a problem as you might think. Long before the GIL starts showing its ugly head, you have garbage collectors slowing you down, and writing files to disk or sending data over the network doesn’t tax the GIL. In the largest system I have been involved in, it is the database that keeps slowing things down.
There are ways to see if you are hitting the GIL, you can google them. First make sure that is the thing slowing you down. Almost always, it is not the case.
As pointed out below, Jython exists. Unfortunately, it often lags the current version of Python considerably. It has gotten to 2.7, but only because Python itself got stuck at 2.7 for way too long. That puts it only 9 serious versions behind HEAD…
But the assertion of greater speed does not bear out. Jython can be a lot faster, for extremely simple things. But it is in general significantly slower, and the slowdown increases with complexity.
Python is already converted to its own notion of bytecode, skewed to make Python in particular execute quickly, and there are implementations that compile down to native code. At the same time, Python interfaces easily to C++, so when a standard library module written in Python performs too poorly, it can be rewritten for speed and distributed on a per-platform basis. So the execution speed of individual instructions or library-function calls is not the reason Python is slow. It is slow because of the very-high-level design that comes with its origins as a scripting language.
If an integer variable carries enough extra information to know its type (rather than compiling in information about the type, because we know nothing about the type of anything at compile time), and you can hook an identifier lookup at three different levels (__get__, __getattr__, and __getattribute__), and the object creation process has four independent stages (the metaclass, the class __call__, __new__, and __init__) none of which can generally be skipped, and virtually everything is a hashmap including your list of globals and module contents (so calling dict() or list() is several times slower than creating {} or []), and polymorphism is accomplished by testing what constructor made you (often several times)…
You pay for all that. And it has nothing to do with the bytecode. You simply cannot meet all of the flexibility expectations laid upon you and compile Python into anything quick, because the design decisions are all made for straightforwardness and intuitive cohesion, not for speed.
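Two of those lookup hooks can be seen in a few lines (a sketch, not the full protocol: the descriptor hook __get__ is a third layer not shown here):

```python
class Hooked:
    present = "found normally"

    def __getattribute__(self, name):
        # runs for EVERY attribute access on the instance
        return object.__getattribute__(self, name)

    def __getattr__(self, name):
        # runs only after normal lookup has failed
        return "fallback for " + name

obj = Hooked()
print(obj.present)  # -> found normally
print(obj.missing)  # -> fallback for missing
```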
What’s the difference between C++ and Java? Both are type-safe, object-oriented languages (not strictly object-oriented in either case), allowing some elements of functional programming style (lambdas). The difference is their usual environment: C++ is normally compiled into machine code, while Java is normally compiled into bytecode (an intermediate code) first and then, during runtime, compiled in several passes, optimizing more and more, into machine code.
In general, the language definition itself doesn’t specify this behaviour (the Java ecosystem does, but not the Java language): both languages, C++ and Java, can be interpreted, compiled or using an intermediate code later on just-in-time compiled into machine code.
The very same is true for Python and Ruby.
Also: “can be” doesn’t mean it is usually done: usually, C++ is compiled at build time to machine code and then executed, while Java is usually compiled into an intermediate code (bytecode in case of Java) that is then bit by bit compiled by the JIT (Just In Time) compiler into machine code.
Both languages have prominent ways of being executed differently: C++ can, as C#, be compiled into “common intermediate language”, which is similar to Java’s bytecode and will also be interpreted and just-in-time compiled then. And Java now has GraalVM which can create machine code and directly executable binaries during build time as well. GraalVM hosts a whole bunch of other languages as well, including Python and Ruby. And then there’s JShell, which could be thought of an interpreter as well (however, there’s bytecode etc. under the hood as well).
Of course you’ll need a C++ compiler and/or the JDK or GraalVM installed on your computer to have it running. However it’s not unusual for a Linux distribution to either have one on board or have it easily installed.
The very same thing is true for Python or Ruby: their interpreters have to be there as well, however it is even more usual to have it on board with a regular Linux distribution. But look at Windows: no Python, no Ruby, no Java, no C++ out of the box. And thinking about JRuby, Ruby can even run on the Java VM.
Note that with Java 21, writing shell-script-like programs (with an initial shebang line) is becoming even easier. For now it requires a flag, since it’s a preview feature, but you no longer need to define a class, access modifiers (public, private, etc.) or static:
The file helloworld:
#!/usr/bin/env -S java --enable-preview --source 21

void main() {
    greet();
}

void greet() {
    System.out.println("Hello world.");
}
Make it executable with chmod +x helloworld and run ./helloworld:
Hello world.
Where’s the difference to Python or Ruby now?
(Thanks to a colleague for this hello world example…)
The only languages directly executed by the operating system I can think of are the old BASIC interpreters on 8-bit home computers such as the Commodore 64: its BASIC interpreter sort of *was* the operating system, and it actually interpreted, generously seen, a kind of intermediate code as well: it didn’t read “p”, “r”, “i”, “n”, “t” when executing a “print” command in a program; instead it pre-interpreted (calling this compiling would go a step too far) those five letters (or any other BASIC command) into a token that was then stored. That saved space and brought speed, both of which were essential to get anything done beyond a hello-world program… (However, even on those machines other languages could be used, then making use of only the half of the operating system that took care of I/O and of those parts that disguised themselves as an excuse for a file system…)
Note that there are (or were) a couple of very specialized pieces of hardware able (after a more or less simple interpreting step) to execute high-level languages directly: there are processors capable of executing Java bytecode directly, and there are processors capable of executing Forth directly.
So, you're diving into Python, and you've hit upon the infamous Global Interpreter Lock, or GIL, in your studies or maybe interviews. It's a bit of a hot topic, especially when we talk about multithreading in Python. Let's break it down!
Understanding the GIL:
Alright, first things first - what's the Global Interpreter Lock, and why does it exist? Well, in Python, the GIL is like a traffic cop for your threads. It stands guard over your Python code and ensures only one thread executes Python bytecode at a time. Sounds a bit restrictive, right? But it's there for a reason - Python's memory management isn't thread-safe.
Now, here's the kicker - this lock can sometimes be a bottleneck, especially in CPU-bound tasks where multiple threads could be utilized more efficiently. But don't worry, we'll get to how we can dance around it!
Multithreading in Python:
So, when you're working with multithreading in Python, the GIL comes into play. It's like having multiple chefs in a kitchen, but only one of them can actively chop veggies at a time. Not exactly a parallel cooking party, right? Threads are awesome for I/O-bound tasks like web scraping or fetching data from a database, but for CPU-bound tasks, they can be a bit hamstrung by the GIL.
Working Around the GIL:
Now, let's get into the nitty-gritty of how we can work around this GIL issue.
1. Multiprocessing Module:
Python gives us the `multiprocessing` module, which is like threading but with separate processes. Each process gets its own Python interpreter and memory space, bypassing the GIL. It's a handy way to parallelize CPU-bound tasks.
from multiprocessing import Pool

def cpu_bound_task(x):
    # Your CPU-intensive task here
    return x * x

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        result = pool.map(cpu_bound_task, range(10))
2. Asyncio for I/O-Bound Tasks:
When it comes to I/O-bound tasks, the `asyncio` module can be a game-changer. It allows you to write asynchronous code that doesn’t get blocked by the GIL, making efficient use of your CPU.
import asyncio

async def io_bound_task():
    # Your I/O-bound task here
    await asyncio.sleep(1)
    return "Task complete"

async def main():
    tasks = [io_bound_task() for _ in range(5)]
    await asyncio.gather(*tasks)

asyncio.run(main())
3. Using External Libraries:
Some external libraries are designed to release the GIL during certain operations. For example, the `numpy` library releases the GIL during numerical operations, making it suitable for parallelizing certain tasks.
import numpy as np

def parallel_numpy_task():
    array = np.random.rand(1000000)
    result = np.sum(array)
    return result
4. Cython and Extensions:
If you're up for a bit more advanced stuff, using Cython or writing extensions in languages like C can help. These languages can execute code without the GIL interference.
# Cython example
# my_module.pyx
def cython_cpu_bound_task():
    # Your Cythonized CPU-bound task here
    return result

# setup.py
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize("my_module.pyx")
)
5. Jython or IronPython:
If you're feeling adventurous, you could explore Jython (for the JVM) or IronPython (for .NET). These Python implementations run on virtual machines that don’t have a GIL, offering parallelism without the GIL bottleneck.
So, there you have it! The GIL can be a bit of a hurdle, especially in CPU-bound scenarios, but Python provides various ways to navigate around it. Whether it's multiprocessing, asyncio, using external libraries, diving into Cython, or exploring alternative Python implementations, you've got options to make your code run more smoothly.
By
- baking the weak memory consistency models of networks into the language
- not sharing any memory except for message boxes
- making messages immutable
In Ruby and Python, you see a sequentially consistent threading model. In a language where everything might be shared via shared memory, that is not efficiently possible on modern hardware, so on many architectures you end up with a GIL. On the x86-64 family a GIL is not necessary to simulate sequential consistency, so there is actually a hypothetical lock-free implementation of Python for x86-64 (the Python programs would still use locks, but the Python runtime wouldn't). Such implementations might be somewhat faster than locking implementations, but without measurements I cannot tell.
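The distinction between the runtime's lock and the program's locks can be made concrete. This sketch shows why Python code still needs its own locks even under a GIL: the GIL serializes individual bytecodes, not whole operations, so a read-modify-write on shared state can lose updates (time.sleep(0) is used here only to force a thread switch at the worst moment).

```python
# Sketch: the GIL protects the interpreter's internals, not your data.
import threading, time

counter = 0
lock = threading.Lock()

def racy(n):
    global counter
    for _ in range(n):
        tmp = counter
        time.sleep(0)        # force a thread switch between read and write
        counter = tmp + 1    # may overwrite another thread's update

def safe(n):
    global counter
    for _ in range(n):
        with lock:           # a program-level lock makes the update atomic
            counter += 1

def run(worker, n=1000, threads=4):
    global counter
    counter = 0
    ts = [threading.Thread(target=worker, args=(n,)) for _ in range(threads)]
    for t in ts:
        t.start()
    for t in ts:
        t.join()
    return counter

print(run(racy))   # usually far less than 4000: updates were lost
print(run(safe))   # always 4000
```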
In D, C++, Java, C#, and others, not everything can be shared; only the things annotated as shared (depending on the language, via keywords such as atomic or volatile) are shared between threads. Thus a global execution lock is also not necessary, and it can be replaced by other means of synchronization which, when only a few accesses are shared, are considerably faster.
There are three things you need to understand for this question’s answer.
- Interpreted vs compiled languages
- Late vs Early binding
- Dependency Hell
Interpreted vs Compiled Languages
Python and Ruby are interpreted.
C and C++ are compiled
First let’s make one thing absolutely clear - Python and Ruby do not need a virtualenv tool any more than C or C++ (C/++) do. So why is virtualenv an essential tool for the former but not the latter?
Let’s look at the real difference. Python and Ruby need a runtime but C/++ executables are standalone.
This is because Ruby and Python are interpreted while C/++ is compiled. Read more about this distinction here: Compiled language vs interpreted language?
Late vs Early Binding
Python and Ruby bind late.
C and C++ bind early by default.
Second, you need to understand that compiled languages generally use early binding and interpreted languages use late binding. This isn’t a strict rule; you can do late binding in C/++, but early binding is the default.
What does early and late binding actually mean?
Inside your computer every function has an address in memory. If you want to run that function you need to know what that address is. This is called the function’s reference.
With early binding the compiler looks up the reference for that function when the executable is compiled. The function and the rest of the code are bound together at that early stage and cannot be changed.
With late binding that reference is not looked up until it is needed at runtime, much later.
Read more here: What is the difference between early binding and late binding in C++?
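Late binding is easy to see in Python itself. A minimal sketch: the name greet is looked up each time caller() runs, so rebinding the name changes what gets called after the fact.

```python
# Sketch: names are resolved at call time in Python (late binding).
def greet():
    return "version 1"

def caller():
    return greet()        # 'greet' is looked up here, at runtime

print(caller())           # version 1

def greet():              # rebinding the name...
    return "version 2"

print(caller())           # version 2: caller() now finds the new function
```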
Dependency Hell
Late binding can lead to dependency hell. Virtualisation solves this problem.
The third thing to understand is hell created by lots of late bound dependencies. With modern code we use lots of shared libraries. If you and I both parse some XML then we might use a shared library.
Shared libraries get updated all the time. Sometimes those changes break existing code. You can read more here: A definitive guide to API-breaking changes in .NET
Let’s make up a pretend XML library that you and I use. It has a function called XMLPARSE that takes an XML string and returns a dictionary.
There are 2 versions of XMLPARSE and they are not compatible. The dictionary returned from version 1 is nothing like the dictionary returned from version 2.
If we used early binding, the default in C/++, then this isn’t a problem. The two versions have different references, so no problem.
With Ruby and Python the binding is late. If version 2 and 1 have the same name, then we can only have one of them available.
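That "only one of them" constraint is visible in the interpreter itself: Python's import system keeps exactly one module object per name in a process-wide cache, so two versions under the same name cannot coexist. A small sketch using a stdlib module (the XMLPARSE library above is imaginary):

```python
# Sketch: sys.modules holds one module per name, so a second import of
# the same name just returns the cached module.
import sys
import json
import json as json_again

print(json is json_again)            # True: both names hit the same cache entry
print(sys.modules["json"] is json)   # True
```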
Let’s imagine that you and I each have a library up on GitHub. Mine uses version 1 of XMLPARSE. Yours uses version 2.
Now somebody wants to use both libraries at the same time.
They install my code, which lists V1 of XMLPARSE as a dependency. So they install V1 and all works fine.
Then they install your code. They already have XMLPARSE so they think all is good. It fails and they realise it needs V2 so they upgrade. Now my code is broken. How can they fix my code without breaking yours?
This is a massive problem in any environment with shared libraries. It’s called dependency hell: Dependency hell - Wikipedia
This is where virtualenv comes to the rescue. Virtualenv allows you to create isolated virtual environments for code to run in.
This means that my library can be placed into one virtual environment alongside version 1 of XMLPARSE while yours is put in a separate virtual environment with version 2.
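A sketch of that isolation using the standard library's venv module (XMLPARSE is still the imaginary package from above; with_pip=False just keeps the demo fast):

```python
# Sketch: two isolated environments; each has its own site-packages, so
# env_v1 could pin XMLPARSE v1 while env_v2 pins v2.
import os, tempfile, venv

root = tempfile.mkdtemp()
for name in ("env_v1", "env_v2"):
    venv.create(os.path.join(root, name), with_pip=False)

# Each environment is a self-contained directory tree
print(sorted(os.listdir(root)))  # ['env_v1', 'env_v2']
```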
In conclusion, the actual answer
Virtualenv solves the problem of dependency hell. A problem that is created by late binding.
Ruby and Python are both late binding languages, so they have this problem. This makes virtualenv an indispensable tool for deploying these languages.
C/++ is early bound by default so this is less of a problem but it still exists. For example, Windows is written mostly in C but it has late bound Dynamic Link Libraries (DLLs). Dependency hell was a massive problem, although it was generally called DLL Hell: DLL Hell - Wikipedia
If you look at the solutions section of that Wikipedia article you will find the same solution as the one provided by virtualenv: Application virtualization - Wikipedia
Python normally does not run on JVM.
The most used Python implementation (CPython) uses its own virtual machine, though you can run Jython (a JVM-native Python) or try to use GraalVM to run Python code.
The difference in performance lies in the static vs dynamic nature of the languages in question. The types of the objects you reference in your program (including ‘primitive’ types like ints or doubles) are known statically in Java (as is each object’s layout), but are inherently dynamic in Python. In the latter case some of the work the compiler could do at compile time is shifted to runtime.
Python is often used in scenarios where most of the heavy lifting (like numeric manipulations) is performed outside of the language itself in heavily optimised external libraries (like numpy). Even java can't always beat this kind of numerical performance (but it can use the same libraries).
BTW, Python running on GraalVM can be faster than “standard” Python when executing pure Python code (but somewhat lags on versions). But the price of calling external libraries is much worse.
“What is the difference between C++ and Java? Does C++ need an interpreter or compiler to be executed, or does it execute directly by the OS like Python and Ruby?”
There are two kinds of compiler, not just one.
A byte compiler parses the code, and for each piece of code a number is emitted.
A native compiler parses the code, and for each piece of code a piece of native code is emitted.
- C++ is compiled using a native compiler. The native code is then run directly on the operating system.
- Python (at least the CPython version) is byte compiled. The byte compiled code is then run on an interpreter.
- Java is byte compiled. It is then run on an interpreter. When the code is run a lot, it is native compiled, so it runs faster. This process of byte compilation followed by native compilation is called JIT (just in time compilation).
- Ruby used to be like Python, byte compiled, but now it is more like Java, JIT compiled
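The byte-compilation step for CPython can be seen directly with the built-in compile(): the source becomes a code object whose co_code attribute is the raw bytecode stream the interpreter loop then executes.

```python
# Sketch: CPython byte-compiles source into a code object; co_code is
# the bytecode that the interpreter loop runs.
code = compile("x = 1 + 2", "<example>", "exec")

print(type(code.co_code))   # <class 'bytes'>: the raw bytecode
namespace = {}
exec(code, namespace)       # the "interpretation" step: run the bytecode
print(namespace["x"])       # 3
```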
Which Python virtual machine? There is no standardized VM for Python.
CPython comes with its own implementation, but it is not documented for use by other languages. Languages have no reason or incentive to target it, and no guarantee that it won’t change dramatically between releases (breaking outside implementations). It’s intended as an internal reference, and was never meant as a target for other languages to use. It also lacks any compelling performance or technical advantages over the JVM or other widely available options. In short, it’s all downside with no upside.
The PyPy implementation of python uses a JIT/rpython stack that is documented and does have interesting performance characteristics. Consequently, it has had numerous languages built on it: Javascript, Scheme, Ruby, IO, Prolog, PHP, etc. But it’s a much younger and less mature project than the JVM, with far less corporate backing.
There are other Python implementations, many of which also use VMs that are targeted by a lot of languages (including Python implementations that use the JVM itself, or Javascript VMs, or the Microsoft/C# CLR)
Also remember: for quite a long time Java was bundled in most web browsers as a target client language: if you wanted to have your code run on the client-side, then the JVM and Javascript were your only major options. And Javascript was pretty low performance for a long time.
So a lot of languages targeted the JVM as one of two choices, and the only one with decent performance and server-side support (this was long before the emergence of Node and other options that really made Javascript viable on the server side).
It's actually pretty brilliant. Compiled languages are always going to be faster than interpreted languages. But they have the downside that you need to build a compiler for every target operating system, and Java began life as an embedded language to be run on devices. So they came up with this hybrid concept where you compiled your source into byte code, which was machine independent. The bytecode was easily mapped to most machine languages so it was then very easy to build interpreters for any target platform. Of cou...
I've been in the internet technology space for close to twenty years, and it never occurred to me to question the difference in terminology.
I suspect it has to do with two things - usage and marketing.
Traditionally, JavaScript was embedded in something. Initially, just in a browser. Then in the web application server with SSJS and Microsoft's ASP. Then in various applications, like Flash and so on. There was no standalone script processor for running generic scripts from, say, the command line. It may be that it was seen as something sitting alongside a rendering engine (terminology which predates JavaScript). So architecturally, you had a document rendering engine and a script execution engine sitting side by side in the browser.
On the other hand, Ruby and Python were built as standalone interpreters much like Perl or even the various command shells, which could be used for generic scripting and could be invoked from the command line, although they were certainly more commonly invoked behind a web server. Only with something like node.js do we have something similar for JavaScript that can be invoked from the command line, and even that is more of a full-fledged server application framework.
Secondly, as dynamic DOM manipulation, Ajax, and the other things that became HTML5 grew more popular, JavaScript performance in the browser became more and more important. Moreover, as web browser rendering frameworks became more convergent (for example, Safari, Chrome, and Opera all used WebKit), JavaScript performance and its execution model became a differentiating factor, so each manufacturer tried to use it to market their browsers, using names like Nitro and V8 to get your heart pumping.
Together, these two things have probably contributed to us calling one an engine and the other an interpreter.
A virtual machine is a virtual computing environment with a specific set of atomic well defined instructions that are supported independent of any specific language and it is generally thought of as a sandbox unto itself. The VM is analogous to an instruction set of a specific CPU and tends to work at a more fundamental level with very basic building blocks of such instructions (or byte codes) that are independent of the next. An instruction executes deterministically based only on the current state of the virtual machine and does not depend on information elsewhere in the instruction stream at that point in time.
An interpreter on the other hand is more sophisticated in that it is tailored to parse a stream of some syntax that is of a specific language and a specific grammar that must be decoded in the context of the surrounding tokens. You can't look at each byte or even each line in isolation and know exactly what to do next. The tokens in the language can't be taken in isolation like they can relative to the instructions (byte codes) of a VM.
A Java compiler converts Java language into a byte-code stream no different than a C compiler converts C Language programs into assembly code. An interpreter on the other hand doesn't really convert the program into any well defined intermediate form, it just takes the program actions as a matter of the process of interpreting the source.
Another test of the difference between a VM and an interpreter is whether you think of it as being language independent. What we know as the Java VM is not really Java specific. You could make a compiler from other languages that result in byte codes that can be run on the JVM. On the other hand, I don't think we would really think of "compiling" some other language other than Python into Python for interpretation by the Python interpreter.
Because of the sophistication of the interpretation process, this can be a relatively slow process....specifically parsing and identifying the language tokens, etc. and understanding the context of the source to be able to undertake the execution process within the interpreter. To help accelerate such interpreted languages, this is where we can define intermediate forms of pre-parsed, pre-tokenized source code that is more readily directly interpreted. This sort of binary form is still interpreted at execution time, it is just starting from a much less human readable form to improve performance. However, the logic executing that form is not a virtual machine, because those codes still can't be taken in isolation - the context of the surrounding tokens still matter, they are just now in a different more computer efficient form.
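CPython's .pyc files are exactly this kind of pre-parsed intermediate form. A sketch with the stdlib py_compile module:

```python
# Sketch: py_compile writes the pre-compiled bytecode form (.pyc) so
# later runs skip the parsing and tokenizing work.
import os, py_compile, tempfile

src = os.path.join(tempfile.mkdtemp(), "mod.py")
with open(src, "w") as f:
    f.write("VALUE = 42\n")

pyc = py_compile.compile(src)   # returns the path of the .pyc it wrote
print(os.path.exists(pyc))      # True
print(pyc.endswith(".pyc"))     # True
```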
For some reason, people keep trying to picture Java as an arcane and dying language. This is especially true amongst non-Java developers.
I agree that Java isn't a language designed in the 21st century and it was stale for a while (until Java 8 came out), but it has taken a considerable position in enterprise software, kind of like that of C in operating systems and C++ in games. So there may be hundreds of languages on top of the JVM, and some may even produce apps with the same level of guarantees, but Java itself will not die. Java is immortal now.
PS: I'm not a java developer and there are dozens of languages I'd prefer working over java.. But I've got to respect the position java earned for itself.
Java and Python are at opposite ends of the spectrum on this issue.
- Java uses private for safety and security: you can mix code from two different sources, and library A can’t abuse the internals of library B. Well, actually, it can, but at least Java tries. Not that this has ever come up in real code anyway. Python, meanwhile, assumes programmers are consenting adults; a convention telling people which members are private and shouldn’t be messed with is good enough. Most other modern OO languages are in between: private is there to protect against accidentally abusing another class—which turns out to be helpful a lot more often than Python thinks—but if you really want to reinterpret_cast and pull the values out, you can.
- Java also uses private members, with getters and setters, to force people to write encapsulated classes. Writing set_x is supposed to be such a pain that you’ll think twice about whether your x should really be settable, because probably it shouldn’t. Of course nobody does think twice, especially since IDEs just do it for all your members automatically, so it’s just making everyone’s code more verbose and obfuscated for no good reason. Meanwhile, in Python, if you think your x should be settable, you just make it a public member. Most other modern OO languages are in between—they discourage public members, but aren’t dogmatic about it.
- Java’s getter/setter idiom also means that, even if you didn’t really need a setter in version 1.0 of your library, if you turn out to need one in version 1.1 (say, to validate the value), you already have a setter, so you don’t break the API. Python solves that with @property. Most other modern OO languages do the same. Java, every new version, the users suggest properties, and the architects refuse to add them, because it will encourage people to write exactly the kind of bad code everyone always writes.
- Java has a strong separation between interface and implementation, even having separate syntax and semantics for interfaces and classes. Python goes for flexibility—interfaces (ABCs), enums, etc., they’re all just classes, and you can hook classes together in almost any way you can imagine. Private members are a big part of implementation hiding, so Java needs them; Python doesn’t have that need. (Or, if you occasionally do, you do it manually; or, if you need to do a lot of it, you write a mixin class or metaclass that automates the manual stuff. But it rarely comes up.)
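The @property point from the list above, sketched: x ships as a plain attribute in v1.0 of a hypothetical class, and v1.1 adds validation without changing any call sites.

```python
# Sketch: @property lets a plain attribute grow validation later
# without breaking the callers' `p.x = value` syntax.
class Point:
    def __init__(self, x):
        self._x = x

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        if value < 0:                      # the "v1.1" validation
            raise ValueError("x must be non-negative")
        self._x = value

p = Point(3)
p.x = 5                  # same syntax as when x was a bare attribute
print(p.x)               # 5
```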
It’s worth noting that this whole thing is really much less of a big deal than people thought 20-odd years ago. Deep hierarchies of classes with protected methods all over the place are really important for GUIs, and complex simulations, and… just about nothing else. So, fortunately, the holy wars over whether private is necessary or sometimes useful or evil are pretty much dead.
And meanwhile, there are other, much more massive, differences between Python and Java classes. Java classes are as much about member layout in memory as about functionality; Python classes normally don’t even define members at all; you just set them as needed (often in __init__, but there’s nothing magic about that), and as for member layout, the object is just a namespace, like globals, with its members (usually) in a dict. Python classes make all lookup dynamic, rather than having virtual and override and final. Python has multiple inheritance (with a linearized resolution order), and uses it for all kinds of things; Java doesn’t allow it. Python has metaclasses and hooks all over the place to let you customize how classes and their instances get built; Java keeps as much of that consistent as it can rather than flexible. And so on. Compared to these differences, public/protected/private is nothing.
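The "object is just a namespace" point above, sketched:

```python
# Sketch: a Python instance is just a namespace; members appear when
# assigned and live in an ordinary dict.
class Box:
    pass

b = Box()
b.anything = 1            # no declaration needed: the member is created now
b.other = "two"

print(b.__dict__)         # {'anything': 1, 'other': 'two'}
```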
It Depends (tm).
The simple answer is that Python and Java have different design philosophies.
The more complete answer is either:
- Too complicated for you to understand, or
- To be found in your notes that you took in class. You did take notes, didn't you, to help you answer the assignment questions?
Actually, maybe both of the above apply to you.