For a language that announced itself as (and raised a lot of money on the premise of being) "a Python superset", this does not sound like a huge achievement.
In all fairness, their website now reads: "Mojo is a pythonic language for blazing-fast CPU+GPU execution without CUDA. Optionally use it with MAX for insanely fast AI inference."
So I suppose it is now just a compiled language with syntax superficially similar to Python's and completely different semantics?
I think it was pretty clear immediately that running Python code was a far-off goal. There was a lot more talk about lifetimes and ownership semantics than details about Python interop. Mojo is more like: can we take the learnings of Swift and Rust, solve the usability and compile-time issues, and build on MLIR to target arbitrary architectures efficiently (and call it a Python superset to raise VC money)?
That said, the upside is huge. If they can get to a point where Python programmers who need more speed reach for Mojo rather than C/C++, because it feels more familiar and interops more easily, that would be huge. And it's a much lower bar than a superset of Python.
It marketed itself explicitly as a "Python superset", which could allow Python programmers to avoid learning a second language and write performant code.
I'd argue that I am not sure what kind of Python programmer is capable of learning things like comptime, borrow checking, and generics, but would struggle with different-looking syntax. So to me this seemed like a deliberate misrepresentation of the actual challenges, made to generate hype and marketing.
Which, fair enough, I suppose is how things work. But it should be _fair_ to point out the obvious too.
They've backed off a little from the Python superset claims and leaned more into "Python family".
> I'd argue that I am not sure what kind of Python programmer is capable of learning things like comptime, borrow checking
One who previously wrote compiled languages ;-). It's not like you forget everything you know once you touch Python.
> and call it a Python superset to raise VC money
What else was proclaimed just to raise VC money?
> For a language that announced itself as (and raised a lot of money on the premise of being) "a Python superset", this does not sound like a huge achievement
I feel like that depends quite a lot on what exactly is in the non-subset part of the language. Being able to use a library from the superset in the subset requires being able to translate the features into something that can run in the subset, so if the superset is doing a lot of interesting things at runtime, that isn't necessarily going to be trivial.
(I have no idea exactly what features Mojo provides beyond what's already in Python, so maybe it's not much of an achievement in this case, but my point is that this has less to do with just being a superset and more to do with what exactly the extra stuff is, so I'm not sure I buy the argument that the marketing you mention is enough to conclude that this isn't much of an achievement.)
The real unique selling point of Mojo is "CPU+GPU execution without CUDA": specifically, you write code that looks like ordinary code, without worrying about distinctions like kernels and device functions or different ways of writing code that runs on the GPU vs. the CPU, and Mojo compiles it down to those things.
I believe they're still working towards making the syntax and semantics more python-like.
It was never going to have Python semantics and be fast. Python isn't slow because of a lack of effort or money, it's slow because of all the things happening in the interpreter.
Given that NVidia has now decided to get serious with Python JIT DSLs in CUDA, as announced at GTC 2025, I wonder how much mindshare Mojo will manage to win among researchers.
"1001 Ways to Write CUDA Kernels in Python"
https://www.youtube.com/watch?v=_XW6Yu6VBQE
"The CUDA Python Developer’s Toolbox"
https://www.nvidia.com/en-us/on-demand/session/gtc25-S72448/
"Accelerated Python: The Community and Ecosystem"
https://www.youtube.com/watch?v=6IcvKPfNXUw
"Tensor Core Programming in Python with CUTLASS 4.0"
https://www.linkedin.com/posts/nvidia-ai_python-cutlass-acti...
There is also Julia, the black swan that many outside the Python community have moved to, with much more mature tooling and first-tier Windows support, for those researchers who for whatever reason have Windows-issued work laptops.
https://info.juliahub.com/industries/case-studies
Mojo as a programming language seems interesting to a language nerd, but I think the jury is still out on whether this is going to be another Swift, or Swift for TensorFlow, in regards to market adoption, given the existing contenders.
Mojo (and Modular's whole stack) is pretty much completely focused on people who are interested in inference, not so much training or research at this moment.
So they're going after people who need to build low-latency, high-throughput inference systems.
Also as someone else pointed out, they also target all kinds of hardware, not just NVidia.
Currently it looks more like CPUs, and eventually AMD, from what I have been following of their YouTube sessions and their whole blog post series about freedom from NVidia and such.
They also don't support CPUs on Windows, unless you use WSL.
Mojo is marketed as a way to get maximum performance on any hardware, not just NVidia.
This may appeal to people wanting to run their code on different hardware brands for various reasons.
True, however that goal is not realized yet; it doesn't even run on Windows natively.
And for those who care, Julia is available today on different hardware brands, as are other Python DSL JITs.
I agree they will get there; the question is whether they will get there fast enough to matter, versus what the mainstream market cares about.
The limitations of DSLs and the pull of Python make it a practical sweet spot, I think, if they manage to get the Python compatibility up to par.
Chris Lattner (the tech lead behind Mojo, LLVM, Clang, Swift and MLIR) appeared on a podcast a bit over a week ago and discussed the state of Mojo and where it is going.
He also discussed open sourcing Mojo and where the company expects to make its money.
https://www.youtube.com/watch?v=04_gN-C9IAo
That video is restricted, do you have a public link?
Doesn't seem to be restricted. Do you mean region locked?
Just worked fine for me.
I'm logged into YouTube; if you're not, perhaps it's something to do with that?
Sorry, but I'm not having any issue with that link.
Here is a link to the episode listing for that podcast, which might help.
https://www.youtube.com/@LatentSpacePod
The factorial test giving zero on Mojo suggests they aren't doing arbitrary-precision integer arithmetic.
I liked Mojo as a Python superset. Wanted to be able to run arbitrary Python through it and selectively change parts to use the new stuff.
A "pythonic language" sounds like that goal has been dropped, at which point the value prop is much less clear to me.
They explicitly cast it to an 'Int' on the Mojo side, but the Modular website claims that isn't a specific bit width, so I am surprised.
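For what it's worth, a fixed-width result doesn't just lose precision here, it collapses to exactly zero once the factorial accumulates 64 factors of two. A minimal Python check of that arithmetic (the 64-bit width and the input value 66 are assumptions for illustration, not taken from the article):

```
# Illustration only: why a fixed-width factorial prints exactly 0.
# By Legendre's formula, n! contains n - popcount(n) factors of two,
# so from n = 66 onward a 64-bit integer result has at least 64
# trailing zero bits and wraps around to 0.
import math

n = 66
print(n - bin(n).count("1"))          # 64 -> factors of two in 66!
print(math.factorial(n) % (1 << 64))  # 0  -> what a 64-bit result would hold
```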
I am not that intrigued by Python being able to call some pre-compiled functions; this is already possible with any language that produces a dynamic library.
The space that I am interested in is execution-time-compiled programs. A use case of this is generating a perfect hash data structure. Say you have a config file that lists the keywords you want to find, and you dynamically generate the perfect hash data structure, compiled as if those keywords were compile-time values (because they are).
Or, if the number of keywords is too small, fall back to a linear search. All done at compile time, without the cost of dynamic dispatch.
Of course, I am talking about Numba. But I think it is cursed by the fact that the host language is Python. Imagine if Python were more strongly typed; it would open up a whole new scale of optimization.
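For illustration, here is a minimal sketch of that idea with today's tools: read the keywords at startup, generate a matcher specialized to them, and let Numba compile it so the keywords are baked in as constants. The helper names are made up, and it assumes Numba's nopython mode handles the generated string comparisons:

```
# Hypothetical sketch: specialize a keyword matcher on values only known
# at program start (e.g. read from a config file), then JIT-compile it.
from numba import njit

def build_matcher(keywords):
    if not keywords:
        # No keywords: the matcher degenerates into a no-op.
        src = "def match(word):\n    return False\n"
    else:
        # Bake each keyword in as a literal, so the compiled code
        # consults no runtime lookup structure at all.
        checks = " or ".join(f"word == {k!r}" for k in keywords)
        src = f"def match(word):\n    return {checks}\n"
    ns = {}
    exec(src, ns)             # generate the specialized Python function
    return njit(ns["match"])  # compile it; the keywords are now constants

# Usage: in the scenario above, keywords would come from --config.
match = build_matcher(["foo", "bar"])
print(match("foo"), match("baz"))  # True False
```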
I would rather imagine CPython adopting a compilation model like Common Lisp, Scheme, Racket, Smalltalk, or Self.
Sadly the contenders in the corner get largely ignored, so we have to make do with special-cased JIT DSLs or with writing native extensions, as in many cases CPython is the only implementation available.
> I am not that intrigued by Python being able to call some pre-compiled functions; this is already possible with any language that produces a dynamic library.
> The space that I am interested in is execution-time-compiled programs. A use case of this is generating a perfect hash data structure. Say you have a config file that lists the keywords you want to find, and you dynamically generate the perfect hash data structure, compiled as if those keywords were compile-time values (because they are).
I'm not sure I understand you correctly, but these two seem connected. If I were to do what you want to do here in Python I'd create a zig build-lib and use it with ctypes.
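For concreteness, the Python side would look something like the sketch below. The library path and the exported `match` symbol are made-up names, and it assumes the shared library was built beforehand with something like `zig build-lib -dynamic`:

```
# Hypothetical sketch: call a function exported from a Zig-built shared library.
# Assumes libkeywords.so was produced by e.g. `zig build-lib keywords.zig -dynamic`
# and exports a C-ABI function taking a C string and returning a bool.
import ctypes

lib = ctypes.CDLL("./libkeywords.so")
lib.match.argtypes = [ctypes.c_char_p]  # C string in
lib.match.restype = ctypes.c_bool       # boolean out

print(lib.match(b"foo"))
```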
Can Zig recompile itself if I change a config in production? I am talking about this
```
python program.py --config <change this>
```
It is basically a recompilation of the whole program at every execution, taking into account the config/machine combination.
So if the config contains no keywords to look up, the program should be able to be compiled into a no-op. Or if the config contains keywords that permit a simple perfect hash algorithm, it should recompile itself to use that mechanism.
I don't think any of the typical systems programming languages allow this.
I am really rooting for Mojo. I love what the language is trying to do, and making it easier to run SOTA AI workloads on hardware that isn't Nvidia + CUDA will open up all kinds of possibilities.
I'm just nervous how much VC funding they've raised and what kind of impacts that could have on their business model as they mature.
If they can manage to make good on their plans to open-source it, I'll breathe a tentative sigh of relief. I'm also rooting for them, but until they're open-source, I'm not willing to invest my own time into their ecosystem.
They already released their code under the Apache 2.0 license. Not everything in their stack is open source but the core things appear to be open source.
For anyone not up to speed on Mojo: Mojo is a pythonic language for blazing-fast CPU+GPU execution without CUDA:
https://news.ycombinator.com/item?id=35790367
I've never been that sold on Mojo; I think I'm unfairly biased away from it because I find new languages interesting, and its big sell is changing as little as possible from an existing language.
That said, importing into Python this easily is a pretty big deal. I can see a lot of teams who just want to get unblocked by some performance thing, finding this insanely helpful!
> its big sell is changing as little as possible from an existing language.
This is not really true. Even though Mojo is adopting Python's syntax, it is a drastically different language under the hood. Mojo is innovating in many directions (e.g. MLIR integration, ownership model, comptime, etc.). The creators didn't feel the need to innovate on syntax in addition to all that.
You're right- I probably should have said something like "part of its sell" or "one of its selling points" or something.
I didn't mean to undermine the ambitious goals the project has. I still wish it was a little bolder on syntax though, Python is a large and complex language as is, so a superset of Python is inherently going to be a very complicated language.
> as I'm definitely in the market for a simple compiled language that can offer Python some really fast functions
So, Nim? https://github.com/yglukhov/nimpy
The real point of Mojo is not the language, it's the deep roots into MLIR which is an attempt to do what LLVM did for compilers, and do it on GPUs / ML hardware. Chris Lattner is leading the project and he created LLVM and MLIR.
I hope this ends up superseding Cython
I think that's one of the real strong use cases I see coming up.
If they can make calling Mojo from Python smooth it would be a great replacement for Cython. You also then get easy access to your GPU etc.
Same, and I literally started Cython. :-)
Are the labels on the first output misplaced, or was Mojo actually slower?
```
3628800 Time taken: 3.0279159545898438e-05 seconds for mojo
3628800 Time taken: 5.0067901611328125e-06 seconds for python
```
My guess is that the slight overhead of interacting with Mojo led to this speed discrepancy, and if a higher factorial (one within the overflow limits, etc.) were run, this overhead would become negligible (as seen in the second example). It's similar to JAX code being slower than NumPy code for small operations but much faster for larger ones on CPUs, etc.
> Functions taking more than 3 arguments. Currently PyTypeBuilder.add_function() and related function bindings only support Mojo functions that take up to 3 PythonObject arguments: fn(PythonObject, PythonObject, PythonObject).
Lol wut. For the life of me I cannot fathom what design decision in their cconv/ABI leads to this.
I was wondering if it is function overloads of the same name, defining it non-variadically:
add_function(PythonObject)
add_function(PythonObject, PythonObject)
add_function(PythonObject, PythonObject, PythonObject)
There was a similar pattern in the Guava library years ago, where ImmutableList.of(…) would only support up to 20 arguments because there were 20 different instances of the method for each possible argument count.
I wish Python were as fast as Perl for equivalent string-based workloads.
I’m someone who should be really excited about this, but I fundamentally don’t believe that a programming language can succeed behind a paywall or industry gatekeeper.
I’m always disappointed when I hear anything about Mojo. I can’t even fully use it, to say nothing of examining it.
We all need money, and like to have our incubators, but the LLVM guy thinks like Jonathan Blow with jai?
I don’t see the benefit of joining an exclusive club to learn exclusively-useful things. That sounds more like a religion or MLM than anything RMS ever said :p
Was lowkey hoping this was about Mojolicious, the Perl web framework