New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for high level synthesis plugins to return a DAGCircuit #12203
base: main
Are you sure you want to change the base?
Conversation
Pull Request Test Coverage Report for Build 8896892592Details
💛 - Coveralls |
…plugins. Fix conflicts.
One or more of the the following people are requested to review this:
|
from .diagonal import Diagonal | ||
|
||
_EPS = 1e-10 # global variable used to chop very small numbers to zero | ||
_DECOMPOSER1Q = OneQubitEulerDecomposer("U3") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was never used, I removed it to make the cyclic import handling easier but if it is necessary, I can bring it back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am completely in favor of removing unused imports.
…e unchanged files, add missing tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ElePT! In general, the changes look fine to me, however I do have several comments and would like to discuss this a bit more before approving.
From the user perspective, when writing a synthesis plugin, having to return both the QuantumCircuit
version (when use_dag
is False
) and the DAGCircuit
version (when use_dag
is True
) seems a bit too much, and is technically not backwards-compatible. Yet I like the idea of passing the flag use_dag
to indicate which of the two versions is preferred, when one can create both. In other words, the user may ignore this flag and return only one version of the circuit, the HighLevelSynthesis
pass is then responsible to convert it to the right format when needed. But for performance-critical applications the user may want to support both versions and return the preferred one based on the flag. What do you think?
Speaking of this, the two mostly used clifford synthesis functions are currently synthesize_clifford_greedy
and synthesize_clifford_bm
(the latter is for at most 3-qubit cliffords). While it may make sense to optimize these two specific functions, it probably makes less sense to optimize e.g. synthesize_clifford_ag
, @ShellyGarion, what do you think?(Also cc @mtreinish).
It would be nice to have some idea how much performance improvement we can get by directly constructing the DAGCircuit
vs. constructing a QuantumCircuit
and running dag_to_circuit
.
Please also see my comment on possible ideas on avoiding code duplication.
There is also a question on how this relates to eventually porting (at least some of the performance-critical) code to Rust.
What do you think?
from .diagonal import Diagonal | ||
|
||
_EPS = 1e-10 # global variable used to chop very small numbers to zero | ||
_DECOMPOSER1Q = OneQubitEulerDecomposer("U3") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am completely in favor of removing unused imports.
circuit = QuantumCircuit(clifford.num_qubits, name=str(clifford)) | ||
qreg = QuantumRegister(clifford.num_qubits) | ||
if use_dag: | ||
circuit = DAGCircuit() | ||
circuit.name = str(clifford) | ||
circuit.add_qreg(qreg) | ||
else: | ||
circuit = QuantumCircuit(qreg, name=str(clifford)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally this looks fine, yet I do find that this introduces code duplication and that the new code a bit harder to read and to maintain.
How much time do we save (here and in other synthesis functions) by directly constructing a DAGCircuit
rather than constructing a QuantumCircuit
and converting it to DAGCircuit
using dag_to_circuit
?
For instance, If in practice this time is negligible, then maybe we don't really need the DAGCircuit
version of the code and we can just let HighLevelSynthesis
convert it to DAGCircuit
(when needed).
And if this time is not negligible, then what exactly causes the slowdown? For instance, would it be possible to create some kind of "list of gates" instead of (the presumably more expensive) QuantumCircuit
/DAGCircuit
, and only construct the right object from this list at the very end?
Alternatively, maybe we can split synth_clifford_ag
into two functions like _synth_clifford_ag_circuit
and _synth_clifford_ag_dag
that produce QuantumCircuit
and DAGCircuit
respectively? (Unfortunately this would introduce even more code duplication but imho would make each individual function more readable).
Or maybe we can have some generic functions like _append_cx
or _append_h
, etc. that under the hood use the correct method of adding a gate to either a DAGCircuit
or a QuantumCircuit
(and I think we might already have something of this kind).
Also this somewhat relates to the question/task of whether we eventually want to port these algorithms to Rust.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point @alexanderivrii, and I agree that the current code is less readable. I am working on adding a HLS plugin benchmark to our ASV suite to be able to test if we are actually getting relevant performance improvements or not, but I haven't had time to finish it yet. I think that the best course of action would be to finish the benchmark and make a decision based off the results. I also considered the option to have a full dag-based function as an alternative, but it's a lot of functions to migrate and I wanted to get some feedback on the current state first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made a first attempt at benchmarking and I am getting mixed results, some tests improve with the PR but others show a performance regression (marked with an asterisk). The benchmark is done with the code in (#12329), which is still WIP, so they may also not be the best benchmarks.
[Edit]: took a look offline and some of the plugins might not be running as expected, so I wouldn't take the results too seriously.
Before After
transpiler_hls_plugins.HLSPluginsSuite.time_clifford_ag_large_circuit 909±4ms 967±30ms (*)
transpiler_hls_plugins.HLSPluginsSuite.time_clifford_ag_rand_benchmarking 2.20±0.01s 2.00±0.08s
transpiler_hls_plugins.HLSPluginsSuite.time_clifford_bm_large_circuit 936±20ms 960±30ms (*)
transpiler_hls_plugins.HLSPluginsSuite.time_clifford_bm_rand_benchmarking 5.93±0.09s 2.50±0.06s
transpiler_hls_plugins.HLSPluginsSuite.time_clifford_default_large_circuit 923±30ms 922±20ms
transpiler_hls_plugins.HLSPluginsSuite.time_clifford_default_rand_benchmarking 6.25±0.1s 6.41±0.2s (*)
transpiler_hls_plugins.HLSPluginsSuite.time_clifford_greedy_large_circuit 926±20ms 969±30ms (*)
transpiler_hls_plugins.HLSPluginsSuite.time_clifford_greedy_rand_benchmarking 8.86±0.2s 9.81±0.06s (*)
transpiler_hls_plugins.HLSPluginsSuite.time_linear_func_kms 3.11±0.07s 2.64±0.03s
transpiler_hls_plugins.HLSPluginsSuite.time_linear_func_pmh 1.84±0.08s 1.66±0.04s
layered_circuit.append(CX_circ, qubit_list, copy=False) | ||
layered_circuit.append(H2_circ, qubit_list, copy=False) | ||
layered_circuit.append(S1_circ, qubit_list, copy=False) | ||
layered_circuit.append(CZ1_circ, qubit_list, copy=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea (not copying)! I now seem to remember that either _append
or compose
should be even more performant (not sure).
decomposition = synth_cnot_depth_line_kms(mat, use_dag=use_dag) | ||
|
||
if use_dag: | ||
if use_transposed: | ||
transposed_circ = decomposition.copy_empty_like() | ||
# QuantumCircuits get a default name, while DAGCircuits don't | ||
transposed_circ.name = decomposition.name or "synth_circuit" | ||
transposed_circ.name += "_transpose" | ||
for node in decomposition.topological_op_nodes(): | ||
if node.op.name != "cx": | ||
raise CircuitError("The circuit contains non-CX gates.") | ||
transposed_circ.apply_operation_front( | ||
node.op, reversed(node.qargs), node.cargs, check=False | ||
) | ||
decomposition = transposed_circ | ||
if use_inverted: | ||
inverted_circ = decomposition.copy_empty_like() | ||
for node in decomposition.topological_op_nodes(): | ||
inverted_circ.apply_operation_front( | ||
node.op.inverse(), node.qargs, node.cargs, check=False | ||
) | ||
decomposition = inverted_circ | ||
return decomposition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is another place where (imho) supporting both QuantumCircuit
and DAGCircuit
makes the code quite a bit more complex. I am wondering how much performance improvement we get in practice.
def random_clifford_circuit(num_qubits, num_gates, gates="all", seed=None): | ||
"""Generate a pseudo random Clifford circuit.""" | ||
|
||
qubits_1_gates = ["i", "x", "y", "z", "h", "s", "sdg", "sx", "sxdg", "v", "w"] | ||
qubits_2_gates = ["cx", "cz", "cy", "swap", "iswap", "ecr", "dcx"] | ||
if gates == "all": | ||
if num_qubits == 1: | ||
gates = qubits_1_gates | ||
else: | ||
gates = qubits_1_gates + qubits_2_gates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function still exists in the test_cliffords
file, right? Maybe it would be nice to move it (and also random_linear_circuit
) into the main qiskit
repository. I don't know why we have not done this already, but more than once I wished we had this in the main codebase.
# pylint: disable=invalid-name | ||
"""Tests for Clifford synthesis functions.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I very much like that this functionality was moved to a separate file, thanks!
Summary
This PR addresses #11681 by adding a
use_dag
flag to the high level synthesis pluginsrun
method (and synthesis functions) to internally use and return aDAGCircuit
instead of aQuantumCircuit
. The flag is set toTrue
when the plugins are called inside theHighLevelSynthesis
pass to avoid back and forth conversions between circuit and dag.This PR makes sure that the interface can handle
DAGCircuit
plugin outputs and implements the full end-to-endDAGCircuit
pipeline for most of the synthesis functions/plugins (see table below). The only function that I didn't think made much sense to re-implement withDAGCircuit
wassynth_clifford_layers
, because it works by constructing independent circuits that are then treated as instruction blocks, and I think thatQuantumCircuit
is better suited for this particular workflow. So I just convert the circuit to dag during the function return.Other small changes that I included in this PR are:
QuantumCircuit.append(.., copy=False)
where possible inside the synthesis functions (should be faster)test/python/quantum_info/operators/symplectic/test_clifford.py
. I migrated those tests to a newtest/python/synthesis/test_clifford_sythesis.py
file.DAGCircuit
in the synthesis functions, I did this by making allqiskit.synthesis
imports inqiskit.circuit
only imported at runtime. I think that this is a coherent move and should reduce the risk of cyclic imports in the future, but I am a bit weary of creating further issues so let me know if you see anything suspicious in this change.Details and comments
Because I found the structure of the plugins a bit convoluted at first, I have compiled a table where I have been keeping track of the modified plugins, related synthesis functions, whether they are able to use the dag end-to-end and where one can find the function tests (which have been modified to test
use_dag
):DefaultSynthesisClifford
synth_clifford_full
AGSynthesisClifford
synth_clifford_ag
BMSynthesisClifford
synth_clifford_bm
GreedySynthesisClifford
synth_clifford_greedy
LayerSynthesisClifford
synth_clifford_layers
LayerLnnSynthesisClifford
synth_clifford_depth_lnn
DefaultSynthesisLinearFunction
KMSSynthesisLinearFunction
synth_cnot_depth_line_kms
PMHSynthesisLinearFunction
synth _cnot_count_full_pmh
KMSSynthesisPermutation
synth_permutation_depth_lnn_kms
BasicSynthesisPermutation
synth_permutation_basic
ACGSynthesisPermutation
synth_permutation_acg
TokenSwapperSynthesisPermutation