Queries
Here we list some example queries. Copy and paste them into the web interface mentioned above. The Cypher reference card is very helpful when writing queries.
Query 1
    MATCH (species:SBML_SPECIES)-[isMod:IS_MODIFIER]->()
    WHERE NOT((species)-[:IS_REACTANT]->()
           OR (species)-[:IS_PRODUCT]->())
    WITH species, COUNT(isMod) AS numOfMod
    ORDER BY numOfMod DESC
    LIMIT 1
    MATCH (species)-[:BELONGS_TO]->(model)
    WHERE (model:SBML_MODEL)
    RETURN model.NAME AS Model,
           species.NAME AS Species,
           numOfMod
Query 1: Return the model containing the species that acts only as a modifier (never as a reactant or product) in the largest number of reactions.
Result 1: The model “Schaber2012 – Hog pathway in yeast”, whose species Hog1PPActive acts as a modifier in ten reactions.
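The filtering and ranking logic of Query 1 can be sketched outside the database. The following Python snippet is only an illustration on made-up data: the species and reaction names are hypothetical, and the helper `top_modifier_only` is our own name, not part of the database or of Neo4J.

```python
from collections import Counter

# Hypothetical toy edges (species, relationship, reaction); not database content.
edges = [
    ("S1", "IS_MODIFIER", "R1"),
    ("S1", "IS_MODIFIER", "R2"),
    ("S2", "IS_MODIFIER", "R1"),
    ("S2", "IS_REACTANT", "R3"),
    ("S3", "IS_MODIFIER", "R3"),
]

def top_modifier_only(edges):
    """Return (species, count) for the species with the most IS_MODIFIER
    edges among those without any IS_REACTANT or IS_PRODUCT edge."""
    # Species that appear as reactant or product are excluded (the WHERE NOT part).
    excluded = {s for s, rel, _ in edges if rel in ("IS_REACTANT", "IS_PRODUCT")}
    # Count modifier edges per remaining species (the WITH ... COUNT part).
    counts = Counter(s for s, rel, _ in edges
                     if rel == "IS_MODIFIER" and s not in excluded)
    # Keep only the top entry (ORDER BY ... DESC LIMIT 1).
    return counts.most_common(1)[0] if counts else None

print(top_modifier_only(edges))  # ('S1', 2)
```

Here S2 is dropped because it is also a reactant, so S1 wins with two modifier edges.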
Query 2
    MATCH (m:SBML_MODEL)-[:REFERENCES_SIMULATION_MODEL]-(REF)
          -[:BELONGS_TO*2]->(sed:DOCUMENT)
    WHERE m.NAME='Novak1997 - Cell Cycle'
    RETURN m.NAME AS Model,
           m.ID AS ModelID,
           REF.MODELSOURCE AS ModelSource,
           sed.FILENAME AS SEDMLFile
Query 2: Return all simulations that can be applied to the model “Novak1997 – Cell Cycle”.
Result 2: The requested model can be run by two simulations, which reproduce Figures 2a and 2b of Novak 1997.
Query 3
    MATCH (sed:DOCUMENT)<-[:BELONGS_TO*2]-(sim:SEDML_SIMULATION)-[:SIMULATES]
          ->(REF:SEDML_MODELREFERENCE)-[:REFERENCES_SIMULATION_MODEL]->(m)
    WHERE (sim.SIMKISAO='KISAO:0000019')
      AND FILTER(label IN labels(m) WHERE label ='CELLML_MODEL')
    RETURN m.NAME, sed.FILENAME
Query 3: Return only CellML models that can be simulated using a Livermore Solver (KISAO:0000019).
Result 3: The CellML encoded “Tyson 1991” model and the corresponding SED-ML file.
Query 4
    START res=node:annotationIndex('RESOURCETEXT:(m-phase inducer phosphatase)')
    MATCH (res)<-[rel:IS]-(a:ANNOTATION)-->(s:SBML_SPECIES)
          <-[:OBSERVES]-(o)-[:BELONGS_TO*]->(doc:DOCUMENT)
    WITH doc,res,s,o
    MATCH ()<-[:IS_MODIFIER]-(s)-[:BELONGS_TO]->(m)
    RETURN DISTINCT doc.FILENAME AS SEDML,
           collect(DISTINCT m.NAME) AS Model,
           collect(DISTINCT res.URI) AS Resource,
           collect(DISTINCT s.NAME) AS Species,
           collect(DISTINCT o.TARGET) AS Target
Query 4: Return simulation descriptions that observe a particular species acting as a modifier in a reaction. The observed species should be annotated as “m-phase inducer phosphatase” using the qualifier “is”.
Result 4: The result is shown and explained in Figure 3.
Query 5
    MATCH (r:RESOURCE)-[qualifier:BELONGS_TO]->()
    WITH r, COUNT(qualifier) AS AnnotationCount
    ORDER BY AnnotationCount DESC
    LIMIT 3
    RETURN r.URI AS Annotation, AnnotationCount
Query 5: What are the three most frequently used annotations?
Result 5: The three most frequently used annotations are SBO:0000009 (1127 times), SBO:0000252 (509 times) and GO:0043241 (484 times).
Query 6
    MATCH ()-[rel]->(res:RESOURCE)-[:IS_ONTOLOGY_ENTRY]-(c)-[:isA*0..]->(s)
    WHERE s.id="SBO_0000009"
    RETURN COUNT(rel)
Query 6: How many annotations point to the term SBO:0000009 or one of its children?
Result 6: 3373 annotations point to SBO:0000009 or one of its children; 1127 of them point directly to SBO:0000009.
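The `isA*0..` part of Query 6 follows zero or more isA links, so the term itself is included along with all of its descendants. A minimal Python sketch of that idea on a made-up ontology follows; the term ids `T1` to `T9` and both helper functions are hypothetical and only serve the illustration.

```python
# Hypothetical toy ontology: term -> list of parent terms ("isA" links).
is_a = {"T2": ["T1"], "T3": ["T1"], "T4": ["T2", "T3"]}
# Hypothetical annotations: each entry names the term an annotation points to.
annotations = ["T1", "T2", "T4", "T4", "T9"]

def ancestors_or_self(term, is_a):
    """Collect the term itself (zero hops) and everything reachable via isA."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(is_a.get(current, []))
    return seen

def count_annotations(root, annotations, is_a):
    """Count annotations pointing to root or to any of its descendants."""
    return sum(root in ancestors_or_self(t, is_a) for t in annotations)

direct = annotations.count("T1")
total = count_annotations("T1", annotations, is_a)
print(direct, total)  # 1 4
```

Note that the sketch counts every annotation once. A path-based count can differ when a term reaches the root by more than one isA path, as `T4` does here.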
Query 7
    MATCH (m:SBML_MODEL)<-[:BELONGS_TO*1..2]-(a:ANNOTATION)
          <-[:BELONGS_TO]-(r:RESOURCE)
    WITH m AS Model, COUNT(r) AS NumberOfAnnotation
    RETURN MAX(NumberOfAnnotation),
           MIN(NumberOfAnnotation),
           avg(NumberOfAnnotation),
           stdev(NumberOfAnnotation)
Query 7: What is the minimum, maximum and average number of annotations per model?
Result 7: A model has a maximum of 800, a minimum of three and an average of 71 annotations.
Query BM1
    MATCH (m:SBML_MODEL)-->(s:SBML_SPECIES)
    WHERE (m.ID="BIOMD0000000001")
    RETURN m AS Model,
           collect(s.ID) AS SpeciesID,
           collect(s.NAME) AS SpeciesName
Query BM1: From model BIOMD0000000001, list all species identifiers and names.
Result BM1: 12 species IDs (ALL, I, DL, ILL, D, DLL, B, BL, A, AL, IL, BL) and names (ActiveACh2, Intermediate, …)
Query BM2
    MATCH (r:RESOURCE)-->()-[:BELONGS_TO]->(element)-->(m:SBML_MODEL)
    WHERE m.ID="BIOMD0000000001"
    RETURN element.ID AS Element,
           LABELS(element) AS ElementType,
           collect(r.URI) AS ElementAnnotation
Query BM2: Get the element annotations of the model BIOMD0000000001.
Result BM2: 104 annotations for 65 distinct elements. For example, species ALL is annotated with IPR002394, GO:0005892 and SBO:0000297.
Query BM3
    MATCH (r:RESOURCE)<-[rel]-()-->(e)-[:BELONGS_TO]->(m:SBML_MODEL)
    WHERE r.URI=~".*GO.*0005892"
    RETURN m.ID AS ModelID,
           collect(e.ID) AS ElementIDs,
           TYPE(rel) AS Qualifier,
           r.URI AS URI
Query BM3: All model elements with annotations to acetylcholine-gated channel complex.
Result BM3: From each model (BIOMD0000000001 and BIOMD0000000002) the same 12 species IDs are returned (ALL, I, DL, ILL, D, DLL, B, BL, A, AL, IL, BL), all are qualified with isVersionOf.
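Query BM3 selects resources with the regular expression `.*GO.*0005892`. In Cypher, `=~` has to match the whole string, so the URI must end with `0005892`. The Python sketch below mimics this with `re.fullmatch`; the URIs are hypothetical examples, and the formats actually stored in the database may differ.

```python
import re

# Hypothetical URIs for illustration only.
uris = [
    "urn:miriam:obo.go:GO%3A0005892",
    "http://identifiers.org/go/GO:0005892",
    "urn:miriam:obo.go:GO%3A0005892_extra",
    "urn:miriam:interpro:IPR002394",
]

# Same pattern as in Query BM3.
PATTERN = re.compile(r".*GO.*0005892")

def matching(uris):
    """Keep the URIs that the pattern matches in full, as `=~` would."""
    return [u for u in uris if PATTERN.fullmatch(u)]

for uri in matching(uris):
    print(uri)
```

Only the first two URIs are kept: the third one carries a suffix after `0005892`, and the fourth does not contain `GO` at all.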
Query P1
    MATCH (res:RESOURCE), (sbo:SBOOntology)
    WHERE (res.URI =~ ".*SBO.*")
      AND (RIGHT(res.URI, 7) = RIGHT(sbo.id, 7))
    CREATE (res)-[link:IS_ONTOLOGY_ENTRY]->(sbo)
    RETURN COUNT(link);
Query P1: Select the SBO annotations extracted from the models and link each of them to the corresponding concept in the SBO ontology.
Result P1: The number of created links.
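Query P1 pairs a resource with an ontology concept when the last seven characters of both identifiers agree. A minimal Python sketch of that join is shown below. The URIs and ids are hypothetical examples in a plausible format, and `link_sbo` is our own helper name.

```python
# Hypothetical values; the exact URI and id formats in the database may differ.
resource_uris = [
    "urn:miriam:biomodels.sbo:SBO%3A0000009",
    "urn:miriam:obo.go:GO%3A0005892",
    "urn:miriam:biomodels.sbo:SBO%3A0000252",
]
sbo_ids = ["SBO_0000009", "SBO_0000252", "SBO_0000297"]

def link_sbo(resource_uris, sbo_ids):
    """Return (uri, sbo_id) pairs whose last seven characters agree."""
    # Index the ontology ids by their 7-character suffix: RIGHT(sbo.id, 7).
    by_suffix = {sbo_id[-7:]: sbo_id for sbo_id in sbo_ids}
    links = []
    for uri in resource_uris:
        # Only SBO resources are considered: res.URI =~ ".*SBO.*".
        if "SBO" in uri and uri[-7:] in by_suffix:
            links.append((uri, by_suffix[uri[-7:]]))
    return links

links = link_sbo(resource_uris, sbo_ids)
print(len(links))  # 2
```

The dictionary lookup is a design choice of the sketch: it avoids comparing every resource with every ontology concept, which is what matching two unconnected patterns amounts to.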
Query M1
    MATCH (m:CELLML_MODEL)
    RETURN m
Query M1: Database look-up. Return all CellML models.
Result M1: A list of 841 models.
Query M2
    MATCH (m:CELLML_MODEL)
    WHERE m.NAME = 'tyson_1991'
    RETURN m
Query M2: Database look-up and filtering. Return CellML models with the name “tyson_1991”.
Result M2: A model node containing the attribute NAME: “tyson_1991”.
Query M3
    MATCH (m:CELLML_MODEL)-->(c:CELLMLCOMPONENT)
    WHERE m.NAME = 'tyson_1991'
    RETURN c.NAME
Query M3: Database graph structure query. Select the aforementioned Tyson model and return all its components.
Result M3: The components YP, Y, M, pM, CP, C2, environment and reaction_constants.
Query M4
    MATCH (m:CELLML_MODEL)-->(c:CELLMLCOMPONENT)-->(v:CELLMLVARIABLE)
    WHERE m.NAME = 'tyson_1991'
    RETURN COUNT(v)
Query M4: Database aggregation query. Count the variables contained in the components of the aforementioned Tyson model.
Result M4: This model has 68 variables.
Query M5
    MATCH (m:CELLML_MODEL)-->(c:CELLMLCOMPONENT)-->(v:CELLMLVARIABLE)
    WITH c AS component, COUNT(v) AS NumOfVar
    RETURN MIN(NumOfVar), MAX(NumOfVar), avg(NumOfVar), stdev(NumOfVar)
Query M5: Statistics query. Retrieve the minimum, maximum, average and standard deviation of the number of variables attached to a component.
Result M5: A minimum of one and a maximum of 431 variables are attached to a component of a CellML model. On average, each component has 9.64 variables attached, with a standard deviation of almost 16.
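Query M5 aggregates in two stages: it first counts the variables per component and then summarises those counts. The Python sketch below reproduces the two stages on made-up data; the component and variable names are hypothetical. `statistics.stdev` uses the N-1 denominator, which is also what the Neo4J documentation describes for `stdev`.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical (component, variable) pairs, one per component-to-variable link.
pairs = [
    ("c1", "v1"), ("c1", "v2"), ("c1", "v3"),
    ("c2", "v4"),
    ("c3", "v5"), ("c3", "v6"),
]

def variable_stats(pairs):
    """Return (min, max, average, sample standard deviation) of the
    number of variables per component."""
    # Stage 1: count variables per component (WITH c, COUNT(v) AS NumOfVar).
    per_component = Counter(component for component, _ in pairs)
    counts = list(per_component.values())
    # Stage 2: summarise the per-component counts (the RETURN clause).
    return min(counts), max(counts), mean(counts), stdev(counts)

print(variable_stats(pairs))
```

A component without any variable never shows up in the pairs, which is consistent with the reported minimum being one rather than zero.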
Query M6
    START res=node:annotationIndex('RESOURCETEXT:(m-phase inducer phosphatase)')
    RETURN res
Query M6: Database index query. Retrieve all annotations containing the phrase “m-phase inducer phosphatase”.
Result M6: A set of seven resources (InterPro IPR000751; Enzyme Commission number 3.1.3.48; and UniProt P30311, P23748, P20483, P06652, P30304).
Nodes and Relationships
Neo4J connects two nodes by a relationship. Here we list all possible types of nodes and all relationships possible between two node types. This list is comparable to the database schema for relational databases.
Source node | Relationship | Target node |
ANNOTATION | BELONGS_TO | MODEL |
ANNOTATION | IS_CREATOR | PERSON |
ANNOTATION | HAS_PUBLICATION | PUBLICATION |
ANNOTATION | isDescribedBy | RESOURCE |
ANNOTATION | is | RESOURCE |
ANNOTATION | isVersionOf | RESOURCE |
ANNOTATION | occursIn | RESOURCE |
ANNOTATION | BELONGS_TO | SBML_COMPARTMENT |
ANNOTATION | HAS_SBOTERM | RESOURCE |
ANNOTATION | BELONGS_TO | SBML_SPECIES |
ANNOTATION | BELONGS_TO | SBML_REACTION |
ANNOTATION | BELONGS_TO | SBML_PARAMETER |
ANNOTATION | BELONGS_TO | SBML_EVENT |
ANNOTATION | isHomologTo | RESOURCE |
ANNOTATION | hasVersion | RESOURCE |
ANNOTATION | isDerivedFrom | RESOURCE |
ANNOTATION | hasPart | RESOURCE |
ANNOTATION | hasProperty | RESOURCE |
ANNOTATION | encodes | RESOURCE |
ANNOTATION | isPartOf | RESOURCE |
ANNOTATION | BELONGS_TO | SBML_RULE |
ANNOTATION | BELONGS_TO | SBML_FUNCTION |
ANNOTATION | isEncodedBy | RESOURCE |
CELLMLCOMPONENT | BELONGS_TO | MODEL |
CELLMLCOMPONENT | HAS_VARIABLE | CELLMLVARIABLE |
CELLMLCOMPONENT | IS_CONNECTED_TO | CELLMLCOMPONENT |
CELLMLCOMPONENT | BELONGS_TO | CELLMLREACTION |
CELLMLREACTION | HAS_REACTION | CELLMLCOMPONENT |
CELLMLVARIABLE | BELONGS_TO | CELLMLCOMPONENT |
CELLMLVARIABLE | IS_MAPPED_TO | CELLMLVARIABLE |
CELLMLVARIABLE | HAS_DELTA_VAR | CELLMLVARIABLE |
CELLMLVARIABLE | IS_DELTA_VAR | CELLMLVARIABLE |
DOCUMENT | HAS_MODEL | MODEL |
DOCUMENT | HAS_SEDML | SEDML |
GOOntology | isA | GOOntology |
KISAOOntology | isA | KISAOOntology |
MODEL | BELONGS_TO | DOCUMENT |
MODEL | HAS_ANNOTATION | ANNOTATION |
MODEL | HAS_COMPONENT | CELLMLCOMPONENT |
MODEL | HAS_REACTION | SBML_REACTION |
MODEL | HAS_COMPARTMENT | SBML_COMPARTMENT |
MODEL | HAS_SPECIES | SBML_SPECIES |
MODEL | HAS_PARAMETER | SBML_PARAMETER |
MODEL | HAS_EVENT | SBML_EVENT |
MODEL | HAS_RULE | SBML_RULE |
MODEL | HAS_FUNCTION | SBML_FUNCTION |
PERSON | BELONGS_TO | PUBLICATION |
PERSON | BELONGS_TO | ANNOTATION |
PUBLICATION | BELONGS_TO | ANNOTATION |
PUBLICATION | HAS_AUTHOR | PERSON |
RESOURCE | BELONGS_TO | ANNOTATION |
RESOURCE | IS_ONTOLOGY_ENTRY | GOOntology |
RESOURCE | IS_ONTOLOGY_ENTRY | SBOOntology |
SBML_COMPARTMENT | BELONGS_TO | MODEL |
SBML_COMPARTMENT | HAS_ANNOTATION | ANNOTATION |
SBML_COMPARTMENT | CONTAINS_SPECIES | SBML_SPECIES |
SBML_EVENT | BELONGS_TO | MODEL |
SBML_EVENT | HAS_ANNOTATION | ANNOTATION |
SBML_FUNCTION | BELONGS_TO | MODEL |
SBML_FUNCTION | HAS_ANNOTATION | ANNOTATION |
SBML_PARAMETER | BELONGS_TO | MODEL |
SBML_PARAMETER | HAS_ANNOTATION | ANNOTATION |
SBML_REACTION | BELONGS_TO | MODEL |
SBML_REACTION | HAS_ANNOTATION | ANNOTATION |
SBML_REACTION | HAS_PRODUCT | SBML_SPECIES |
SBML_REACTION | HAS_REACTANT | SBML_SPECIES |
SBML_REACTION | HAS_MODIFIER | SBML_SPECIES |
SBML_RULE | BELONGS_TO | MODEL |
SBML_RULE | HAS_ANNOTATION | ANNOTATION |
SBML_SPECIES | BELONGS_TO | MODEL |
SBML_SPECIES | HAS_ANNOTATION | ANNOTATION |
SBML_SPECIES | IS_LOCATED_IN | SBML_COMPARTMENT |
SBML_SPECIES | IS_PRODUCT | SBML_REACTION |
SBML_SPECIES | IS_REACTANT | SBML_REACTION |
SBML_SPECIES | IS_MODIFIER | SBML_REACTION |
SBOOntology | isA | SBOOntology |
SEDML | BELONGS_TO | DOCUMENT |
SEDML | HAS_MODELREFERENCE | SEDML_MODELREFERENCE |
SEDML | HAS_SIMULATION | SEDML_SIMULATION |
SEDML | HAS_TASK | SEDML_TASK |
SEDML | HAS_DATAGENERATOR | SEDML_DATAGENERATOR |
SEDML | HAS_OUTPUT | SEDML_OUTPUT |
SEDML_CURVE | BELONGS_TO | SEDML_OUTPUT |
SEDML_DATAGENERATOR | BELONGS_TO | SEDML |
SEDML_DATAGENERATOR | HAS_VARIABLE | SEDML_VARIABLE |
SEDML_MODELREFERENCE | BELONGS_TO | SEDML |
SEDML_MODELREFERENCE | IS_REFERENCED_IN_TASK | SEDML_TASK |
SEDML_MODELREFERENCE | IS_SIMULATED | SEDML_SIMULATION |
SEDML_MODELREFERENCE | USED_IN_DATAGENERATOR | SEDML_VARIABLE |
SEDML_MODELREFERENCE | REFERENCES_SIMULATION_MODEL | MODEL |
SEDML_OUTPUT | BELONGS_TO | SEDML |
SEDML_OUTPUT | HAS_CURVE | SEDML_CURVE |
SEDML_SIMULATION | BELONGS_TO | SEDML |
SEDML_SIMULATION | IS_ONTOLOGY_ENTRY | KISAOOntology |
SEDML_SIMULATION | IS_REFERENCED_IN_TASK | SEDML_TASK |
SEDML_SIMULATION | SIMULATES | SEDML_MODELREFERENCE |
SEDML_TASK | BELONGS_TO | SEDML |
SEDML_TASK | REFERENCES_MODEL | SEDML_MODELREFERENCE |
SEDML_TASK | REFERENCES_SIMULATION | SEDML_SIMULATION |
SEDML_VARIABLE | BELONGS_TO | SEDML_DATAGENERATOR |
SEDML_VARIABLE | CALCULATES_MODEL | SEDML_MODELREFERENCE |
SEDML_VARIABLE | OBSERVES | SBML_SPECIES |
SEDML_VARIABLE | OBSERVES | CELLMLVARIABLE |
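When writing a query, it can help to look up which relationships are allowed between two node types. A minimal Python sketch of such a look-up is given below. It uses only a handful of rows copied from the table above (first column, relationship, third column); the functions `relationships` and `outgoing` are hypothetical helpers and not part of the database.

```python
# A few rows copied from the table above; the full list is longer.
SCHEMA = [
    ("MODEL", "HAS_SPECIES", "SBML_SPECIES"),
    ("SBML_SPECIES", "BELONGS_TO", "MODEL"),
    ("SBML_SPECIES", "IS_MODIFIER", "SBML_REACTION"),
    ("SBML_REACTION", "HAS_MODIFIER", "SBML_SPECIES"),
    ("SEDML_VARIABLE", "OBSERVES", "SBML_SPECIES"),
]

def relationships(source, target, schema=SCHEMA):
    """List the relationship types leading from source to target."""
    return sorted(rel for s, rel, t in schema if s == source and t == target)

def outgoing(source, schema=SCHEMA):
    """List (relationship, target) pairs that start at source."""
    return sorted({(rel, t) for s, rel, t in schema if s == source})

print(relationships("SBML_SPECIES", "SBML_REACTION"))  # ['IS_MODIFIER']
print(outgoing("SBML_SPECIES"))
```

Because relationships are directed, `relationships("SBML_REACTION", "SBML_SPECIES")` yields `HAS_MODIFIER` rather than `IS_MODIFIER`.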
REST API
For demonstration purposes, the REST API is available on a different Neo4J instance that is not otherwise accessible. Please refer to:
https://sems.bio.informatik.uni-rostock.de/projects/morre/
Please keep in mind that the database described on the webpage above is only available for REST access.