Modern robotics research often needs gradients that pass cleanly through physics. You might want to tune a feedback policy with backprop, identify physical parameters from rollouts, or train a network that ingests sensor streams and outputs controls. CppMjStep is my answer to that workflow. It plugs MuJoCo into PyTorch as a custom autograd op written in C++, so you can differentiate through full simulation trajectories and keep the speed you need for real experiments. The extension supports multi-step unrolling, batching, and sensor gradients, and it uses multithreading for better throughput. On large models with large batch sizes, my measurements show more than a 2x speedup over a pure Python approach.
Differentiating through physics is powerful but expensive. Finite differences across a rollout can balloon into thousands of simulator calls. CppMjStep leans on MuJoCo’s built-in derivative routines to keep the math both correct and fast. Under the hood, it calls mjd_transitionFD and friends to obtain state and control Jacobians for the discrete dynamics, then chains them across the horizon so you can backpropagate through your entire rollout.
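In symbols (my notation, not necessarily the library's internals): write the discrete dynamics as x(t+1) = f(x(t), u(t)) with per-step Jacobians A(t) = ∂x(t+1)/∂x(t) and B(t) = ∂x(t+1)/∂u(t). Over a horizon H the sensitivity of the final state is the product ∂x(H)/∂x(0) = A(H−1) A(H−2) ⋯ A(0), and backprop evaluates it right to left as vector-Jacobian products, so the gradient of a scalar loss never needs the full product, only one matrix-vector multiply per step against each A(t) and B(t).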
Two design choices make this practical for day-to-day work. First, the core op is implemented in C++ and wired into PyTorch’s autograd, so gradient flow feels native while avoiding Python overhead. Second, it adds sensor support, which means you can include IMU, force, or other sensor readings in the computational graph with minimal extra cost. If you have ever wanted end-to-end gradients from controls through contact-rich dynamics and back through sensors, this is built for that.
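To make the autograd wiring concrete, here is a rough Python sketch of what a single-step version of the op could look like, built on torch.autograd.Function and mjd_transitionFD. Everything here is illustrative: the state layout ([qpos, qvel, act] in, sensor readings appended to the output) is my simplification rather than the library's exact convention, and the real extension does this in C++ over multiple steps with batching and multithreading.

import numpy as np
import torch
import mujoco as mj

class SingleMjStep(torch.autograd.Function):
    # Illustrative single-step op. Assumes nq == nv (no quaternion joints) so
    # position and velocity coordinates line up; the real CppMjStep handles the
    # general case.

    @staticmethod
    def forward(ctx, state, ctrl, m, d):
        nq, nv, na, nu, ns = m.nq, m.nv, m.na, m.nu, m.nsensordata
        # Write the incoming state and controls into mjData.
        d.qpos[:] = state[:nq].detach().numpy()
        d.qvel[:] = state[nq:nq + nv].detach().numpy()
        d.act[:] = state[nq + nv:nq + nv + na].detach().numpy()
        d.ctrl[:] = ctrl.detach().numpy()
        # Finite-difference Jacobians of the transition and of the sensors.
        nx = 2 * nv + na
        A = np.zeros((nx, nx)); B = np.zeros((nx, nu))
        C = np.zeros((ns, nx)); D = np.zeros((ns, nu))
        mj.mjd_transitionFD(m, d, 1e-6, 1, A, B, C, D)
        mj.mj_step(m, d)
        ctx.save_for_backward(*(torch.as_tensor(M, dtype=state.dtype)
                                for M in (A, B, C, D)))
        out = np.concatenate([d.qpos, d.qvel, d.act, d.sensordata])
        return torch.as_tensor(out, dtype=state.dtype)

    @staticmethod
    def backward(ctx, grad_out):
        A, B, C, D = ctx.saved_tensors
        nx = A.shape[0]
        g_x, g_s = grad_out[:nx], grad_out[nx:]   # dynamics part, sensor part
        grad_state = A.T @ g_x + C.T @ g_s        # vector-Jacobian products
        grad_ctrl = B.T @ g_x + D.T @ g_s
        return grad_state, grad_ctrl, None, None

The backward pass is just two vector-Jacobian products per step, which is why also exposing the sensor Jacobians C and D makes sensor gradients nearly free.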
Here is a minimal example that treats the simulator as a differentiable layer. Create your model and data in MuJoCo, wrap them with MjStep, and call it like a PyTorch module.
import torch
import mujoco as mj
from CppMjStep import MjStep, PyMjStep # C++ op and a Python reference
# Load model
xml_path = "path/to/model.xml"
m = mj.MjModel.from_xml_path(filename=xml_path)
d = mj.MjData(m)
# Build differentiable step, choose horizon and threading
step = MjStep(m, d, n_steps=5, multithread=False)
# State and control tensors
state = torch.rand(m.nq + m.nv + m.na + m.nsensordata, requires_grad=True)
ctrl = torch.rand(m.nu, requires_grad=True)
# One differentiable rollout step
next_state = step(state, ctrl)
loss = next_state.square().mean()
loss.backward()
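Because the op behaves like any other differentiable layer, it drops straight into an optimization loop. The sketch below tunes a small linear feedback policy by backprop through short rollouts. The policy, horizon, and loss are placeholders I picked for illustration, not part of the package, and it assumes MjStep reconstructs the simulator state from the tensor you pass in, so feeding outputs back as inputs gives a consistent rollout.

# Placeholder policy and optimizer; only `step` comes from CppMjStep.
policy = torch.nn.Linear(state.numel(), m.nu)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for it in range(100):
    opt.zero_grad()
    x = state.detach()                 # restart each rollout from the same state
    for t in range(20):                # 20 calls x 5 physics steps per call
        u = torch.tanh(policy(x))      # bounded controls, assuming a ctrl range near [-1, 1]
        x = step(x, u)
    loss = x[:m.nq].square().mean()    # placeholder objective: drive positions to zero
    loss.backward()                    # gradients flow through every MjStep call
    opt.step()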
Two practical notes. First, MuJoCo 3.1.2 expects the initial state for rollouts to use the full physics state size, which includes time, and the project notes that requirement explicitly. Second, you can toggle multithreading at initialization to trade a little determinism for speed.
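If you need to size or fill that state yourself, MuJoCo's own state utilities can do it. The snippet below is standard MuJoCo 3.x API, shown for convenience rather than taken from the CppMjStep docs; how the result maps onto what MjStep expects is spelled out in the README.

import numpy as np

# Full physics state spec: time, qpos, qvel, act.
spec = mj.mjtState.mjSTATE_FULLPHYSICS
size = mj.mj_stateSize(m, spec)           # = 1 + nq + nv + na
full_state = np.zeros(size)
mj.mj_getState(m, d, full_state, spec)    # read the current state out of d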
The repo includes timing plots that compare the C++ extension with a Python reference on models with and without sensors. The headline trend is simple: as batch size and model size grow, the C++ path pulls away and passes a 2x speedup over the Python reference in typical reinforcement learning workloads. That difference is what turns a proof of concept into a trainable loop.
MuJoCo exposes discrete-time Jacobians through mjd_transitionFD, which returns the matrices A = ∂y/∂x and B = ∂y/∂u for a single step. CppMjStep composes these across steps so gradients flow from the loss back to the initial state and controls. You keep the convenience of PyTorch while relying on MuJoCo’s finite differencing, which is tailored to its solver and contact model.
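If you want to see the chaining at the level of raw MuJoCo calls, the sketch below queries the Jacobians for two consecutive steps and composes them. The matrix shapes follow the MuJoCo documentation; m and d are the objects loaded earlier, and the helper name is mine.

import numpy as np

nx = 2 * m.nv + m.na                     # rows/cols of A per the MuJoCo docs

def step_jacobians(m, d):
    # Jacobians of one discrete step taken from the current state of d,
    # then advance the simulation by that step.
    A = np.zeros((nx, nx)); B = np.zeros((nx, m.nu))
    C = np.zeros((m.nsensordata, nx)); D = np.zeros((m.nsensordata, m.nu))
    mj.mjd_transitionFD(m, d, 1e-6, 1, A, B, C, D)
    mj.mj_step(m, d)
    return A, B

A0, B0 = step_jacobians(m, d)
A1, B1 = step_jacobians(m, d)

# Chain rule over two steps: sensitivities of x2 with respect to x0, u0, u1.
dx2_dx0 = A1 @ A0
dx2_du0 = A1 @ B0
dx2_du1 = B1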
CppMjStep builds on the idea introduced by DiffMjStep and extends it with sensor support and a multithreaded C++ core. If you have used DiffMjStep before, the mental model carries over, so moving projects onto the faster path should be straightforward.
If your problem needs extremely long horizons or very high resolution time steps, remember that backprop through time still carries memory cost. CppMjStep helps with speed, but you will still want techniques like truncated horizons or gradient checkpointing in heavy regimes.
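One standard way to cap that memory is to checkpoint chunks of the rollout so intermediate activations are recomputed during the backward pass rather than stored. Here is a hedged sketch using torch.utils.checkpoint with the step object from the earlier example; it assumes the op's forward is a pure function of (state, ctrl), so re-running it during backward is safe.

from torch.utils.checkpoint import checkpoint

def rollout_chunk(x, ctrls):
    # Unroll a short chunk of the trajectory; ctrls has shape (chunk_len, nu).
    for u in ctrls:
        x = step(x, u)
    return x

ctrl_seq = torch.rand(100, m.nu, requires_grad=True)   # 100-step control sequence
x = state
for chunk in ctrl_seq.split(10):                        # checkpoint every 10 steps
    # Only the chunk boundaries stay in memory; the interior of each chunk
    # is recomputed when gradients are needed.
    x = checkpoint(rollout_chunk, x, chunk, use_reentrant=False)
loss = x.square().mean()
loss.backward()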
The package is pip installable, with environment specifics and troubleshooting in the project’s install guide. The README also lists the precise state size requirement for MuJoCo 3.1.2, along with usage snippets and benchmark figures.
CppMjStep draws inspiration from DiffMjStep and uses MuJoCo’s official derivative functions to keep results consistent with the core simulator.