Gradient descent and calculations of gradients

To optimize the VQE ansatz, we need to compute derivatives of the energy with respect to the variational parameters. We first develop a simple approach tailored to the one-qubit case. For this case, we have defined an ansatz in terms of the Pauli rotation matrices. These define an arbitrary one-qubit state on the Bloch sphere through the expression

$$ \vert\psi\rangle = \vert \psi(\theta,\phi)\rangle =R_y(\phi)R_x(\theta)\vert 0 \rangle. $$

Each of these rotation matrices can be written in a more general form as

$$ R_{i}(\gamma)=\exp{\left(-\imath\frac{\gamma}{2}\boldsymbol{\sigma}_i\right)}=\cos{\left(\frac{\gamma}{2}\right)}\boldsymbol{I}-\imath\sin{\left(\frac{\gamma}{2}\right)}\boldsymbol{\sigma}_i, $$

where \( \boldsymbol{\sigma}_i \) is one of the Pauli matrices \( \boldsymbol{\sigma}_{x,y,z} \).

Differentiating the exponential form with respect to \( \gamma \) gives

$$ \frac{\partial R_{i}(\gamma)}{\partial \gamma}=-\frac{\imath}{2}\boldsymbol{\sigma}_i R_{i}(\gamma). $$
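As a quick numerical sanity check, the derivative of a Pauli rotation, \( -\frac{\imath}{2}\boldsymbol{\sigma}_i R_i(\gamma) \), can be compared with a central finite difference. The sketch below assumes NumPy; the helper names `rotation` and `rotation_derivative`, the test angle, and the step size are illustrative choices and not part of the original notes.

```python
import numpy as np

# Identity and Pauli matrices
I2 = np.eye(2, dtype=complex)
PAULI = {
    "x": np.array([[0, 1], [1, 0]], dtype=complex),
    "y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def rotation(axis, gamma):
    """R_i(gamma) = cos(gamma/2) I - i sin(gamma/2) sigma_i."""
    return np.cos(gamma / 2) * I2 - 1j * np.sin(gamma / 2) * PAULI[axis]

def rotation_derivative(axis, gamma):
    """Analytic derivative: -(i/2) sigma_i R_i(gamma)."""
    return -0.5j * PAULI[axis] @ rotation(axis, gamma)

gamma = 0.7  # arbitrary test angle (illustrative)
h = 1e-6     # finite-difference step (illustrative)

max_error = 0.0
for axis in PAULI:
    # Central difference approximation of dR/dgamma
    numeric = (rotation(axis, gamma + h) - rotation(axis, gamma - h)) / (2 * h)
    error = np.max(np.abs(numeric - rotation_derivative(axis, gamma)))
    max_error = max(max_error, error)

print("largest deviation between analytic and numerical derivative:", max_error)
```

The deviation should be limited only by the finite-difference truncation and round-off errors.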

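We can now calculate the derivatives of the expectation value of the Hamiltonian with respect to the angles \( \theta \) and \( \phi \). Since \( \vert\psi(\theta,\phi)\rangle = R_y(\phi)R_x(\theta)\vert 0\rangle \), differentiating \( R_x(\theta) \) produces a factor \( -\frac{\imath}{2}\boldsymbol{\sigma}_x \) that sits between the two rotations, while differentiating \( R_y(\phi) \) produces a factor \( -\frac{\imath}{2}\boldsymbol{\sigma}_y \) that acts directly on \( \vert\psi(\theta,\phi)\rangle \). We have two derivatives

$$ \frac{\partial}{\partial \theta}\left[\langle \psi(\theta,\phi) \vert \boldsymbol{H}\vert \psi(\theta,\phi)\rangle\right]=\frac{\partial}{\partial \theta}\left[\langle\boldsymbol{H}(\theta,\phi)\rangle\right]=\langle \psi(\theta,\phi) \vert \boldsymbol{H}R_y(\phi)\left(-\frac{\imath}{2}\boldsymbol{\sigma}_x\right)R_x(\theta)\vert 0\rangle+\hspace{0.1cm}\mathrm{h.c.}, $$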

and

$$ \frac{\partial }{\partial \phi}\left[\langle \psi(\theta,\phi) \vert \boldsymbol{H}\vert \psi(\theta,\phi)\rangle\right]=\frac{\partial}{\partial \phi}\left[\langle\boldsymbol{H}(\theta,\phi)\rangle\right]=\langle \psi(\theta,\phi) \vert \boldsymbol{H}\left(-\frac{\imath}{2}\boldsymbol{\sigma}_y\right)\vert \psi(\theta,\phi)\rangle+\hspace{0.1cm}\mathrm{h.c.} $$
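The two derivatives can be evaluated by brute force with a few matrix-vector products. The sketch below is a minimal illustration, assuming NumPy and a hypothetical one-qubit Hamiltonian \( \boldsymbol{H}=0.5\boldsymbol{\sigma}_z+0.3\boldsymbol{\sigma}_x+0.2\boldsymbol{\sigma}_y \) chosen only for the example. The generator \( \boldsymbol{\sigma}_x \) is inserted between \( R_y(\phi) \) and \( R_x(\theta) \), since that is where \( R_x(\theta) \) sits in the ansatz, and the sum of a term and its Hermitian conjugate is written as twice the real part. The result is compared with central finite differences.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def rotation(sigma, gamma):
    """R(gamma) = cos(gamma/2) I - i sin(gamma/2) sigma."""
    return np.cos(gamma / 2) * I2 - 1j * np.sin(gamma / 2) * sigma

# Hypothetical Hamiltonian, used for illustration only
H = 0.5 * Z + 0.3 * X + 0.2 * Y
ket0 = np.array([1, 0], dtype=complex)

def state(theta, phi):
    """Ansatz |psi> = R_y(phi) R_x(theta) |0>."""
    return rotation(Y, phi) @ rotation(X, theta) @ ket0

def energy(theta, phi):
    psi = state(theta, phi)
    return np.real(np.vdot(psi, H @ psi))

def gradient(theta, phi):
    """Analytic gradient: <psi|H|d psi> + h.c. = 2 Re <psi|H|d psi>."""
    psi = state(theta, phi)
    dpsi_dtheta = rotation(Y, phi) @ (-0.5j * X) @ rotation(X, theta) @ ket0
    dpsi_dphi = (-0.5j * Y) @ psi
    return np.array([
        2 * np.real(np.vdot(psi, H @ dpsi_dtheta)),
        2 * np.real(np.vdot(psi, H @ dpsi_dphi)),
    ])

theta, phi = 0.4, 1.1  # arbitrary test angles
h = 1e-6               # finite-difference step
numeric = np.array([
    (energy(theta + h, phi) - energy(theta - h, phi)) / (2 * h),
    (energy(theta, phi + h) - energy(theta, phi - h)) / (2 * h),
])
analytic = gradient(theta, phi)
print("analytic:", analytic)
print("numeric: ", numeric)
```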

This means that we have to calculate two additional quantities besides the expectation value of the Hamiltonian itself. In our first attempt, we will compute them in a brute-force way, performing the various matrix-matrix and matrix-vector multiplications explicitly. As the reader will quickly see, this approach becomes impractical if we venture beyond a few qubits. We will therefore try to rewrite the above derivatives in a smarter way; see the article by Maria Schuld et al.
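A brute-force gradient descent for the one-qubit case could look like the following sketch. It assumes NumPy, a hypothetical Hamiltonian \( \boldsymbol{H}=0.5\boldsymbol{\sigma}_z+0.3\boldsymbol{\sigma}_x+0.2\boldsymbol{\sigma}_y \), and a plain update rule \( (\theta,\phi)\leftarrow(\theta,\phi)-\eta\nabla\langle\boldsymbol{H}\rangle \). The learning rate `eta`, the starting angles, and the number of iterations are illustrative assumptions, not values taken from the notes. The final energy is compared with the lowest eigenvalue from exact diagonalization.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def rotation(sigma, gamma):
    """R(gamma) = cos(gamma/2) I - i sin(gamma/2) sigma."""
    return np.cos(gamma / 2) * I2 - 1j * np.sin(gamma / 2) * sigma

# Hypothetical Hamiltonian, used for illustration only
H = 0.5 * Z + 0.3 * X + 0.2 * Y
ket0 = np.array([1, 0], dtype=complex)

def energy_and_gradient(theta, phi):
    """Return <H> and its gradient with respect to (theta, phi)."""
    rx, ry = rotation(X, theta), rotation(Y, phi)
    psi = ry @ rx @ ket0
    h_psi = H @ psi
    # Derivatives of the state with respect to each angle
    dpsi_dtheta = ry @ (-0.5j * X) @ rx @ ket0
    dpsi_dphi = (-0.5j * Y) @ psi
    energy = np.real(np.vdot(psi, h_psi))
    # <psi|H|d psi> + h.c. = 2 Re <psi|H|d psi>; H is Hermitian
    grad = 2 * np.real(np.array([
        np.vdot(h_psi, dpsi_dtheta),
        np.vdot(h_psi, dpsi_dphi),
    ]))
    return energy, grad

params = np.array([2.0, 1.0])  # starting angles (assumed)
eta = 0.5                      # learning rate (assumed)
for step in range(500):
    _, grad = energy_and_gradient(*params)
    params = params - eta * grad

final_energy, _ = energy_and_gradient(*params)
exact_ground_energy = np.linalg.eigvalsh(H)[0]  # lowest eigenvalue
print("VQE energy:  ", final_energy)
print("exact energy:", exact_ground_energy)
```

Every iteration builds the full rotation matrices and state vector, which is exactly the cost that grows exponentially with the number of qubits.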