Keras 3.0 officially released: major update adds PyTorch and JAX support, and 2.5 million developers worldwide now use Keras


Updated: 2025-07-12. Source: SLTechnology News&Howtos

Shulou (Shulou.com), 12/24 report:

Today Keras, the popular deep learning framework, officially shipped version 3.0. The release adds support for PyTorch and JAX, improves performance, and makes large-scale distributed training easier.

After five months of public beta testing, Keras 3.0 is now available to all developers.

Keras 3 is a complete rewrite of the Keras codebase that runs on JAX, TensorFlow, and PyTorch, unlocking new capabilities for training and deploying large models.

François Chollet, the "father of Keras," posted several announcements ahead of the release. The framework currently has 2.5 million developers.

We just released Keras 3.0!

- Run Keras on JAX, TensorFlow, and PyTorch

- Train faster with XLA compilation

- Unlock training runs on any number of devices and hosts via the new Keras distribution API

It's now online at PyPI.

Developers can even use Keras as a low-level cross-framework language to develop custom components, such as layers, models, or metrics.

With a single codebase, these components can be used in native JAX, TensorFlow, and PyTorch workflows.

Keras is multi-backend once again. Originally, Keras could run on Theano, TensorFlow, CNTK, and even MXNet.

In 2018, TensorFlow seemed to be the only viable option since Theano and CNTK had ceased development, so Keras focused on TensorFlow.

But this year, things have changed.

According to the StackOverflow Developer Survey 2023 and the Kaggle Machine Learning and Data Science Survey 2022, TensorFlow has a 55% to 60% market share and is the top choice for production ML, while PyTorch has a 40% to 45% market share and is the top choice for ML research.

JAX, meanwhile, has a much smaller market share but has been embraced by top players in generative AI such as Google DeepMind, Midjourney, and Cohere.

The development team therefore completely rewrote the Keras codebase. Keras 3.0 is built on a modular backend architecture that lets it run on top of arbitrary frameworks.

Keras 3 also preserves compatibility. For example, with the TensorFlow backend you can simply use import keras in place of from tensorflow import keras (during the beta, the package was imported as import keras_core as keras).

Existing code generally runs unchanged, and performance usually improves slightly thanks to XLA compilation.
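As a rough illustration of how a backend is chosen, the sketch below sets the KERAS_BACKEND environment variable before Keras is imported. The helper first_available_backend is a hypothetical convenience written for this example, not part of Keras; it simply checks which of the three framework packages is importable.

```python
import importlib.util
import os


def first_available_backend(candidates=("tensorflow", "jax", "torch")):
    """Hypothetical helper: return the first backend whose package is importable."""
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    raise RuntimeError("No supported Keras backend package is installed.")


# The backend must be chosen before keras is first imported;
# it cannot be switched afterwards within the same process.
if "KERAS_BACKEND" not in os.environ:
    os.environ["KERAS_BACKEND"] = first_available_backend()

import keras

print("Keras", keras.__version__, "is running on", keras.backend.backend())
```

The same model code can then be rerun under another backend by changing only the environment variable.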

Keras vs. TensorFlow

Here is an example of how to convert TensorFlow code to Keras form.

TensorFlow Core Implementation

Keras implementation

In contrast, we can clearly see the simplicity that Keras brings.

TensorFlow allows finer control over each variable, while Keras offers ease of use and the ability to prototype quickly.

For some developers, Keras removes development headaches, reduces programming complexity, and saves time.
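The listings from the original comparison are not reproduced in this copy, so the following is only a minimal sketch of the same kind of contrast, not the article's own code. It fits a one-variable linear regression twice: once with explicit TensorFlow variables and a gradient tape, and once with Keras. The toy data, learning rate, and step count are assumptions chosen for illustration.

```python
import numpy as np
import tensorflow as tf
import keras

# Toy regression data (assumed for illustration): y = 3x + 2 plus a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(256, 1)).astype("float32")
y = (3.0 * x + 2.0 + 0.05 * rng.normal(size=(256, 1))).astype("float32")

# TensorFlow core: variables, forward pass, gradients, and updates are all explicit.
w = tf.Variable(tf.zeros((1, 1)))
b = tf.Variable(tf.zeros((1,)))
learning_rate = 0.1
for _ in range(200):
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(tf.matmul(x, w) + b - y))
    grad_w, grad_b = tape.gradient(loss, [w, b])
    w.assign_sub(learning_rate * grad_w)
    b.assign_sub(learning_rate * grad_b)

# Keras: the same model and training procedure, declared rather than hand-written.
model = keras.Sequential([keras.Input(shape=(1,)), keras.layers.Dense(1)])
model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.1), loss="mse")
model.fit(x, y, batch_size=256, epochs=200, verbose=0)
kernel, bias = model.get_weights()

print("TF core:", float(w.numpy()[0, 0]), float(b.numpy()[0]))
print("Keras:  ", float(kernel[0, 0]), float(bias[0]))
```

Both versions should recover a slope near 3 and an intercept near 2. The TensorFlow version exposes every variable and update, while the Keras version leaves the loop to fit().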

New features in Keras 3.0

The biggest advantage of Keras is that it enables high-velocity development through excellent UX, API design, and debuggability.

It is also a battle-tested framework that powers some of the world's most complex and largest-scale ML systems, such as Waymo's self-driving cars and YouTube's recommendation engine.

So, what are the additional advantages of using the new multi-backend Keras 3?

- Always get the best performance for your models. In benchmarks, JAX generally delivered the best training and inference performance on GPUs, TPUs, and CPUs, but results varied from model to model, and non-XLA TensorFlow was occasionally faster on GPUs.

Because the backend can be switched without any changes to the code, developers can pick whichever one performs best for a given model and train and serve at maximum efficiency.

- Unlock ecosystem optionality for models. Any Keras 3 model can be instantiated as a PyTorch module, exported as a TensorFlow SavedModel, or instantiated as a stateless JAX function.

This means developers can use a Keras 3 model with PyTorch ecosystem packages, with the full range of TensorFlow deployment and production tools (such as TF-Serving, TF.js, and TFLite), and with JAX's large-scale TPU training infrastructure. Write one model.py using the Keras 3 API and get access to everything the ML world has to offer.

- Take advantage of large-scale model parallelism and data parallelism with JAX. Keras 3 includes a new distribution API, the keras.distribution namespace, currently implemented for the JAX backend (with TensorFlow and PyTorch backends to follow).

Model parallelism, data parallelism, and combinations of the two can be achieved easily at arbitrary model and cluster scales. Because the API keeps the model definition, training logic, and sharding configuration separate from one another, distribution workflows are easy to develop and maintain.

- Maximize the reach of your open-source model releases. Want to publish a pre-trained model and have as many people as possible use it? If you implement it in pure TensorFlow or PyTorch, it will be usable by roughly half of the community.

If you implement it in Keras 3, then anyone can use it immediately, regardless of the framework they choose (even if they are not Keras users themselves). Twice the impact without increasing development costs.

- Use data pipelines from any source. The Keras 3 fit(), evaluate(), and predict() routines are compatible with tf.data.Dataset objects, PyTorch DataLoader objects, NumPy arrays, and Pandas dataframes, regardless of the backend you use. You can train a Keras 3 + TensorFlow model on a PyTorch DataLoader, or a Keras 3 + PyTorch model on a tf.data.Dataset.

Pre-trained Models

Developers can start using a wide range of pre-trained models with Keras 3 right away.

All 40 Keras Applications models (the keras.applications namespace) are available on all backends. A large number of pre-trained models in KerasCV and KerasNLP also work with all backends.

These include:

- BERT

- OPT

- Whisper

- T5

- Stable Diffusion

- YOLOv8

Cross-Framework Development

Keras 3 lets developers create components (such as arbitrary custom layers or pre-trained models) that work identically in any framework. In particular, it exposes the keras.ops namespace, which works across all backends.

Keras 3 contains a full implementation of the NumPy API: not something "NumPy-like" but the actual NumPy API, with the same functions and the same parameters. Examples include ops.matmul, ops.sum, ops.stack, and ops.einsum.

Keras 3 also contains a set of neural network-specific functions not found in NumPy, such as ops.softmax, ops.binary_crossentropy, ops.conv, etc.

In addition, as long as developers use only operations from keras.ops, their custom layers, loss functions, and optimizers will work with the same code on JAX, PyTorch, and TensorFlow.

Developers only need to maintain one component implementation and use it in all frameworks.

Keras Architecture

Next, let's take a look at the mechanics and architecture of Keras.

In Keras, the Sequential and Model classes are at the heart of model building, providing a framework for assembling layers and defining computational graphs.

Sequential is a subclass of Model designed for simple cases: a linear stack of layers with one input and one output.

The Sequential class has the following main characteristics:

- Simplicity: Just list the layers in the order you want them to be executed.

- Automatic forward pass: When layers are added to a Sequential model, Keras automatically connects the output of each layer to the input of the next, creating the forward pass without manual intervention.

- Internal state management: Sequential manages the state of its layers (such as weights and biases) and the computation graph. Calling compile configures the learning process by specifying the optimizer, loss function, and metrics.

- Training and inference: The Sequential class provides fit, evaluate, and predict methods for training, evaluating, and running predictions with the model. These methods handle the training loop and the inference process internally.
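The characteristics listed above can be sketched in a few lines. The layer sizes, random data, and training settings below are arbitrary assumptions chosen only to keep the example small.

```python
import numpy as np
import keras

# Simplicity: list the layers in execution order; Keras wires each output to the next input.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(3, activation="softmax"),
])

# compile() configures the learning process: optimizer, loss function, and metrics.
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Random stand-in data: 64 samples, 4 features, 3 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 4)).astype("float32")
y = rng.integers(0, 3, size=(64,))

# fit / evaluate / predict run the training and inference loops internally.
history = model.fit(x, y, epochs=2, batch_size=16, verbose=0)
loss, accuracy = model.evaluate(x, y, verbose=0)
predictions = model.predict(x[:5], verbose=0)

print(model.count_params(), predictions.shape)
```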

The Model class is used with a functional API and provides more flexibility than Sequential. It is designed for more complex architectures, including models with multiple inputs or outputs, shared layers, and nonlinear topologies.

The main features of the Model class are:

- Layer graphs: Model lets you build a graph of layers, in which a layer can be connected to several other layers rather than only to the one before and the one after it.

- Explicit input and output management: In the functional API, you explicitly define the inputs and outputs of your model. This allows for more complex architectures than Sequential.

- Connection flexibility: The Model class can handle models with branches, multiple inputs and outputs, and shared layers, making it suitable for a wide range of applications beyond simple feedforward networks.

- State and training management: The Model class manages the state and training process of all layers, while giving more control over how layers are connected and how data flows through the model.
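To make these features concrete, here is a minimal functional-API sketch with two inputs, one shared layer, a branch point, and two outputs. All layer names and sizes are assumptions invented for the example.

```python
import numpy as np
import keras
from keras import layers

# Explicit inputs: two separate 16-feature inputs.
left = keras.Input(shape=(16,), name="left")
right = keras.Input(shape=(16,), name="right")

# Shared layer: the same Dense instance (and the same weights) encodes both inputs.
shared_encoder = layers.Dense(8, activation="relu", name="shared_encoder")
encoded_left = shared_encoder(left)
encoded_right = shared_encoder(right)

# Non-linear topology: the two branches merge, then split into two output heads.
merged = layers.Concatenate(name="merge")([encoded_left, encoded_right])
similarity = layers.Dense(1, activation="sigmoid", name="similarity")(merged)
category = layers.Dense(4, activation="softmax", name="category")(merged)

# Explicit outputs: the model is defined by its input and output tensors.
model = keras.Model(inputs=[left, right], outputs=[similarity, category])

a = np.zeros((2, 16), dtype="float32")
b = np.ones((2, 16), dtype="float32")
similarity_out, category_out = model.predict([a, b], verbose=0)

print(similarity_out.shape, category_out.shape, model.count_params())
```

Because the encoder is shared, its 136 parameters are counted once, which a Sequential stack could not express.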

Both the Model class and the Sequential class rely on the following mechanisms:

- Layer registration: When you add layers to these models, the layers are registered internally and their parameters are added to the model's parameter list.

- Automatic differentiation: During training, Keras uses the automatic differentiation provided by the backend engine (TensorFlow, JAX, or PyTorch) to calculate gradients. This process is transparent to users.

- Backend execution: The actual calculations (e.g. matrix multiplications and activations) are handled by the backend engine, which executes the computation graph defined by the model.

- Serialization and deserialization: These classes include methods for saving and loading models, which involve serializing the model structure and weights.

Essentially, the Model and Sequential classes in Keras abstract away most of the complexity involved in defining and managing computational graphs, enabling users to focus on the architecture of the neural network rather than the underlying computational mechanisms.

Keras automatically handles the intricate details of how layers connect to each other, how data flows through the network, and how training and inference operations are performed.

Some netizens used memes to express their views on the big Keras update:

I don't know why TensorFlow is being blown up.

Others said the release came at just the right time for them.

Another netizen sent congratulations: "Using Keras on top of PyTorch is an amazing achievement!"

Of course, some netizens disagreed: "I wonder why anyone would use Keras + Torch instead of plain Torch. Unlike TensorFlow, Torch has a good set of APIs."

TensorFlow's inner monologue right now: ah, yes, yes, you are all right.

References:

https://twitter.com/fchollet/status/1729512791894012011

https://keras.io/keras_3/
