Instantiation of Generic Reactions by Markus Krummenacker Q1 2015.

2 Generic Reactions
– Some enzymes have broad substrate specificity.
– Therefore, many EC reactions are formulated as generic reactions, with compound classes as some of their substrates.
– Often, the full specificity range is unknown.
– A generic reaction is a more compact representation than listing every instance reaction explicitly.
– BioCyc PGDBs contain many generic reactions.
– Examples:
  – NAD(P)+ + D-gluconate = NAD(P)H + 5-dehydro-D-gluconate + H+
  – a 2,3,4-saturated fatty acyl CoA + FAD -> FADH2 + a 2,3-dehydroacyl-CoA

3 Problem with FBA
– The reaction network in FBA models is formulated in terms of specific compound instances.
– Problem: disconnect between class and instance frames.
– Example (to eventually produce cardiolipin):
  1. D-glyceraldehyde-3-phosphate + phosphate + NAD+ -> 1,3-bisphospho-D-glycerate + H+ + NADH
  2. dihydroxyacetone phosphate + NAD(P)H + H+ -> sn-glycerol-3-phosphate + NAD(P)+
  Reaction 1 produces the instance NADH, but reaction 2 consumes the class NAD(P)H, so the two reactions are not connected in the network.
– Remedy: automatically generate instance-based versions of each generic reaction.
– Instantiation runs as a preprocessing step. Instance reactions are not saved to the PGDB.

4 Cases of Generic Reactions
– Individual generic reactions: can be part of pathways or be standalone. This is the most common case.
– Polymerization pathways: a series of reactions needs to be instantiated for several cycles.
– Single polymerization reactions, as in glycogen metabolism: not handled currently.
– The success rate of instantiation depends on how thoroughly a PGDB was curated. There are still many generic reactions for which it does not work well.

5 Instantiation Algorithm
– Generic reaction: |Xs| + H2O = |Ys|
  – |Xs| is a class with instances X1, X2, X3
  – |Ys| is a class with instances Y1, Y2
– The instantiation code tries to pair all instances on the LEFT and RIGHT sides with each other, substituting them for the classes, which leads to temporary reactions like:
  – X1 + H2O = Y1
  – X2 + H2O = Y1
  – X1 + H2O = Y2
  – etc.
– Test whether, for a given instance in |Xs|, there is only 1 instance in |Ys| that leads to a mass-balanced reaction equation. If yes, create the instance-based reaction on the fly. (No chemical structure matching yet.)
– If an existing reaction frame for the instance-based reaction can be found, it is used instead of the instantiation.
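
The pairing and uniqueness test described on this slide can be sketched in Python. This is only an illustration under stated assumptions, not the Pathway Tools implementation: the compound formulas, the `FORMULAS` and `CLASS_INSTANCES` tables, and the function names are all hypothetical, and only element counts are compared (no charge, no structures). The lookup of an existing reaction frame is left out.

```python
from collections import Counter
from itertools import product

# Hypothetical toy data: element counts per compound (not real PGDB frames).
FORMULAS = {
    "H2O": {"H": 2, "O": 1},
    "X1": {"C": 2, "H": 4},
    "X2": {"C": 3, "H": 6},
    "X3": {"C": 4, "H": 8},
    "Y1": {"C": 2, "H": 6, "O": 1},
    "Y2": {"C": 3, "H": 8, "O": 1},
}
# Hypothetical class membership.
CLASS_INSTANCES = {"|Xs|": ["X1", "X2", "X3"], "|Ys|": ["Y1", "Y2"]}


def side_formula(compounds):
    """Sum the element counts of all compounds on one side."""
    total = Counter()
    for compound in compounds:
        total.update(FORMULAS[compound])
    return total


def instantiate(left, right):
    """Substitute class members and keep uniquely mass-balanced pairings.

    Returns (successes, non_unique, failures).
    """
    # A class expands to its instances; an instance expands to itself.
    left_slots = [CLASS_INSTANCES.get(c, [c]) for c in left]
    right_slots = [CLASS_INSTANCES.get(c, [c]) for c in right]
    successes, non_unique, failures = [], [], []
    for lhs in product(*left_slots):
        matches = [rhs for rhs in product(*right_slots)
                   if side_formula(lhs) == side_formula(rhs)]
        if len(matches) == 1:
            successes.append((lhs, matches[0]))
        elif matches:
            non_unique.append((lhs, matches))
        else:
            failures.append(lhs)
    return successes, non_unique, failures


successes, non_unique, failures = instantiate(["|Xs|", "H2O"], ["|Ys|"])
for lhs, rhs in successes:
    print(" + ".join(lhs), "=", " + ".join(rhs))
```

With this toy data, X1 and X2 each have exactly one balanced partner, while X3 has none and ends up in `failures`. Two members of |Ys| with the same formula would land in `non_unique` instead.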

6 Requirements
– Reactions have to be fully mass balanced.
– Compound instances need to be created, with structures.
– Compound structures need pH 7.3 protonation.
– Compound instances have to be correctly classified under the classes used in generic reactions.
  – Right-click command Edit->Compound Editor
– Multiple instances with identical chemical formula will be ambiguous.

7 Practical Example
– Right-click command “Show reaction’s instantiations in terminal”
  – EC#
  – GLUCONATE-5-DEHYDROGENASE-RXN : NAD(P)+ + D-gluconate = NAD(P)H + 5-dehydro-D-gluconate + H+ [balanced]
  – successes:
    – NADP+ + D-gluconate = NADPH + 5-dehydro-D-gluconate + H+
    – NAD+ + D-gluconate = NADH + 5-dehydro-D-gluconate + H+
  – failures:
  – non-unique-balanced-instantiations (cannot decide which of several instantiations is correct):
  – success vs. failures vs. non-unique-balanced-instantiations: 2 / 0 / 0

8 Debugging of Pathways
– Right-click command “Show pathway’s instantiated reactions in terminal”
– Conveniently shows results for all reactions of the pathway.
– Debugging: if there are many problems, it helps to add a compound that occurs early in the pathway to the biomass, to see whether at least that compound can be produced.
– Example pathways to instantiate:
  – proline biosynthesis I
  – L-idonate degradation

9 Special ETR Instantiation
– Electron Transfer Reactions (ETRs) usually refer to quinone classes.
– Different isoprenoid tail lengths exist in various organisms.
– The NCBI taxonomy is used to select the tail length.
  – B. subtilis uses menaquinone-7.
– Default instantiation is ubiquinone-8 and menaquinone-8.
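
A taxonomy-driven choice of tail length could look roughly like the sketch below. The slide does not say how the taxonomy lookup is done, so the rule used here (the nearest taxon in the lineage with an entry wins) is an assumption, as are the table layout and all names. Only the B. subtilis entry and the defaults come from the slide.

```python
# Hypothetical override table keyed by taxon name.
# Only the B. subtilis entry is taken from the slide.
TAIL_LENGTH_OVERRIDES = {
    "Bacillus subtilis": {"menaquinone": 7},
}
# Defaults as stated on the slide.
DEFAULT_TAIL_LENGTH = {"ubiquinone": 8, "menaquinone": 8}


def quinone_instance(quinone_class, lineage):
    """Pick a quinone instance name for an organism.

    lineage: taxon names ordered from the organism up to the root.
    Assumption: the closest taxon with an override decides.
    """
    for taxon in lineage:
        override = TAIL_LENGTH_OVERRIDES.get(taxon, {})
        if quinone_class in override:
            return f"{quinone_class}-{override[quinone_class]}"
    return f"{quinone_class}-{DEFAULT_TAIL_LENGTH[quinone_class]}"


print(quinone_instance("menaquinone", ["Bacillus subtilis", "Bacillus", "Bacteria"]))
print(quinone_instance("ubiquinone", ["Escherichia coli", "Bacteria"]))
```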

10 Special Compartments Instantiation
– Schema change in BioCyc 15.0 regarding the representation of reaction compartments.
– Now, 1 reaction can be assigned to multiple compartments.
– FBA creates compartment-specific instantiated reactions to differentiate between the compartments.

11 Syntax of Instantiation IDs
– Every instantiated reaction is assigned a unique ID.
– Visible in the .sol file.
– Constructed from the ID of the generic reaction and the IDs of the instance compounds on the left and right.
– Format: GEN-RXN-ID-L1/L2//R1/R2.suffix-len.
– Non-default compartments:
  – GEN-RXN-ID[CCO-PERI-BAC]-L1/etc….
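
The ID format can be illustrated with a small builder and parser. The slide does not spell out what `suffix-len` counts. This sketch assumes it is the character count of the compound part (`L1/L2//R1/R2`), which would let a parser split the ID even though frame IDs themselves contain hyphens. Both function names are hypothetical, and the real encoding may differ.

```python
def build_instantiation_id(generic_id, left_ids, right_ids, compartment=None):
    """Assemble GEN-RXN-ID[-compartment]-L1/L2//R1/R2.suffix-len."""
    suffix = "/".join(left_ids) + "//" + "/".join(right_ids)
    prefix = generic_id if compartment is None else f"{generic_id}[{compartment}]"
    # Assumption: suffix-len is the length of the compound suffix.
    return f"{prefix}-{suffix}.{len(suffix)}."


def parse_instantiation_id(inst_id):
    """Split an ID back into (generic_id, compartment, left_ids, right_ids)."""
    body, length = inst_id.rstrip(".").rsplit(".", 1)
    n = int(length)
    suffix = body[-n:]
    prefix = body[:-n - 1]  # also drop the hyphen joining prefix and suffix
    compartment = None
    if prefix.endswith("]"):
        prefix, compartment = prefix[:-1].rsplit("[", 1)
    left, right = suffix.split("//")
    return prefix, compartment, left.split("/"), right.split("/")


inst_id = build_instantiation_id("GEN-RXN-ID", ["L1", "L2"], ["R1", "R2"])
print(inst_id)
print(parse_instantiation_id(inst_id))
```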

12 Polymerization Pathways
– Cyclic pathways of generic reactions

13 Polymerization Pathway Instantiation
– A series of instantiated reactions is needed to reach a product of a certain length.
– Run the cycle for 8 iterations (hard-coded, for now).
– Structures of class compounds have R groups.
– The hallmark of a polymerization pathway is that 1 reaction, the polymerization step, is unbalanced.
– For now, the chemical formula of the misbalance is determined, which stands for the monomer unit. (No structural information is used yet.)
– Appropriate instance compounds are searched for by replacing the R groups with an integral multiple of the misbalance.
– Still a bit experimental.
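
The formula arithmetic behind this can be sketched as follows. It is a toy illustration, assuming that a class formula is the non-R part of the structure and that an instance of length n has the formula "non-R part + n × misbalance". The example reaction, the candidate compounds, and all names are hypothetical. The 8 iterations are taken from the slide.

```python
from collections import Counter

MAX_ITERATIONS = 8  # hard-coded cycle count mentioned on the slide


def misbalance(left, right):
    """Element counts of the left side minus the right side (nonzero only)."""
    diff = Counter()
    for formula in left:
        diff.update(formula)
    for formula in right:
        diff.subtract(formula)
    return {element: n for element, n in diff.items() if n != 0}


def instance_formula(core, unit, n):
    """Formula of the class core with its R group replaced by n monomer units."""
    total = Counter(core)
    for element, count in unit.items():
        total[element] += n * count
    return dict(total)


def find_instances(core, unit, candidates):
    """Map repeat count n to the candidate compound whose formula matches."""
    found = {}
    for n in range(1, MAX_ITERATIONS + 1):
        wanted = instance_formula(core, unit, n)
        for name, formula in candidates.items():
            if formula == wanted:
                found[n] = name
    return found


# Hypothetical polymerization step: |Chain| + donor -> |Chain| + leaving group.
chain_core = {"C": 1, "H": 3}  # class structure without its R group
left = [chain_core, {"C": 3, "H": 4, "O": 2}]
right = [chain_core, {"C": 1, "O": 2}]
unit = misbalance(left, right)  # the monomer unit

candidates = {
    "CHAIN-2": {"C": 5, "H": 11},
    "CHAIN-3": {"C": 7, "H": 15},
    "OTHER": {"C": 6, "H": 6},
}
print(unit)
print(find_instances(chain_core, unit, candidates))
```

Because only formulas are compared, two isomeric candidates would both match the same n, which is one reason the approach is still experimental without structural information.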

14 Instantiated reactions in .sol
– In the .sol file, instantiated reactions are listed in full detail, in the reactions sections.
– In the .dat file, which is used for the Cellular Overview, the fluxes of instantiated reactions are all combined into one value for the base generic reaction, because the Cellular Overview can only show the latter.
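
A minimal sketch of the flux roll-up for the .dat file is shown below. It assumes that "combined" means summed and that a mapping from each instantiated reaction ID to its base generic reaction is available. The function name, the mapping, and the flux values are hypothetical.

```python
from collections import defaultdict


def combine_fluxes(fluxes, base_of):
    """Sum fluxes of instantiated reactions onto their base generic reaction.

    fluxes:  reaction ID -> flux
    base_of: instantiated reaction ID -> base generic reaction ID
    Reactions without an entry in base_of keep their own ID.
    """
    totals = defaultdict(float)
    for rxn_id, flux in fluxes.items():
        totals[base_of.get(rxn_id, rxn_id)] += flux
    return dict(totals)


# Hypothetical IDs and flux values, for illustration only.
fluxes = {"GEN-RXN-inst-a": 1.5, "GEN-RXN-inst-b": 0.5, "PLAIN-RXN": 3.0}
base_of = {"GEN-RXN-inst-a": "GEN-RXN", "GEN-RXN-inst-b": "GEN-RXN"}
print(combine_fluxes(fluxes, base_of))
```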

