This chapter will help you become familiar with ESCG by guiding you through a series of sample queries. These and other examples are also available in the B2i examples folder in the Project Explorer view.

ESCG editor

Snow Owl includes an editor and execution environment for Extended SNOMED CT Compositional Grammar (ESCG) expressions. ESCG is a formal grammar for composing expressions from operators and concept identifiers. It can be used for semantic querying. All of the operators and grammar constructs are supported as defined in the NHS LRA specification, which is itself an extension of the HL7 TermInfo specification. ESCG expressions are useful because you can query concepts by their relationships, as opposed to their human-readable descriptions.

Opening the editor and creating a query file

To create and edit ESCG expressions you can use either

  • the embedded ESCG editor, which is part of the Advanced Search dialogue, or
  • the standalone ESCG editor, which can be accessed from the Project Explorer view.

Embedded ESCG editor

The embedded ESCG editor is called embedded because it is part of the Advanced Search and can be accessed through the search dialogue. To start the editor:

  1. Click the search button in the main toolbar to bring up the search dialogue. For details on advanced searches, refer to the Search dialogue section.
  2. Choose the ESCG Query tab in the SNOMED CT Concept Search field to open the ESCG editor.

Embedded ESCG editor

Even though the embedded editor has the same query functions as the standalone editor, there are a few limitations:

  • Concepts cannot be dragged and dropped into the editing field.
  • Queries can’t be saved. If you want to re-run a query, you should use the standalone editor to save your expression.

Standalone ESCG editor

Before you can start using the standalone ESCG editor, you need to create a project and a file with an .escg extension.

To create a new project, right-click anywhere in the Project Explorer to bring up the context menu.

  • Select New > Project (see screenshot below), then General > Project, and click Next.

Opening the New Project wizard

Specify the project name, then click Finish. Your new project will now appear in the Project Explorer view and you are ready to create a file with an .escg extension.

  • Right-click on the project and select New > File.
  • Name your file and add .escg at the end of the name. If you don’t add this extension the ESCG editor won’t be associated with it. Click Finish to create the file.

Specifying the file name with .escg extension

Double-clicking the new .escg file will launch the standalone ESCG editor.

Standalone ESCG Editor

Working with the ESCG Editor

Entering the script

Since the query editor is a text field, you can simply type your query script. However, it’s much easier to use the content assist.

To bring up the content assist menu, press Ctrl + Space. The menu only displays selections that are valid at the position of your cursor, e.g. operators that can be used at the active part of your query.

Content assist in the ESCG Editor

There are two different ways of adding a concept to your query script.

  • If you are working in the standalone version, you can find the concept in the SNOMED CT view and drag and drop it into the ESCG editor.
  • If you are using the embedded editor, bring up the content assist and select Concept – add new concept from picker. This opens the quick search, where you can enter your concept name. Once you have found your concept, select it and press Return to add it to your script. You can also use this feature in the standalone version.

Tip: Any concept within the ESCG expression can act as a hyperlink if you hold down the Ctrl (Cmd on Mac OS X) key and hover over a concept. Clicking on the hyperlink reveals the concept in the SNOMED CT Concepts view and opens it in an editor.

Colored Syntax, Validation and Quick fix

Apart from the content assist, we included a few more features that facilitate semantic querying.

Queries are validated instantaneously as you enter the expression:

  • Errors are shown with red squiggly lines under the erroneous parts of your expression and a red error marker on the left margin of the editor. If you hover over the red error marker, a tooltip gives you more information about the nature of the problem.
  • Warnings are shown with yellow squiggly lines and an exclamation mark on the left margin of the editor.

Click the Quick-fix icon on the margin of the editor to fix an incorrect ID by picking the correct ID or description from the SNOMED CT store (see screenshot below). Note that quick fix is only available in the standalone editor.

Validation and quick fix for incorrect concept

The query text is colored based on the syntax to make it more readable. The most important default colors are:

  • Concept ID: black
  • Operators: pink
  • Vertical bars used for optional text: light green

You can customize your syntax colors on the respective preference page, which is accessible via the main menu File > Preferences > Snow Owl > ESCG Editor > Syntax Coloring.

Syntax coloring

Executing a query and saving the script

To run the query, click the Execute button on the main toolbar. If you are using the embedded expression editor of the search dialogue, you need to click the Search button to perform your query.

It can be useful to save your query script so that you can easily update your search results when your release data changes. Unsaved changes are indicated by a small asterisk next to the title. Click the save button in the main toolbar to save your query script. The embedded editor can only be used to execute queries, not to save the script. If you plan to save your script, create an .escg file and use the standalone editor.

If you want to create another query, you need to create a new .escg file as described in the previous section. There is also a copy and paste function available in the context menu, which is handy if you want to re-use parts of the script for a new query. To bring up the context menu, right-click the file you want to copy and select Copy. Now select the folder you want to copy the new file into, right-click and select Paste. This creates an identical copy of your query script, which you can then edit.

Results of the query

The results of your query will be displayed in the Search view where you can see the number of results and the execution time. You can filter the results in the text field on the top of the view or sort them by clicking on the top of the column (e.g. sort by ID). Use the context menu to bookmark your query results or to add them to a reference set.

Search View displaying the results of a query

Do you speak ESCG?

No? Don’t worry, this section will give you a step-by-step introduction to the Extended SNOMED CT Compositional Grammar. If you already know ESCG and just want a brief overview, we recommend the summary at the end of this section.

The three basics: Operators, IDs and optional text

Let’s start with a simple query that will retrieve all SNOMED CT concepts:

<<138875005|SNOMED CT Concept|

This expression has three different components:

  • An << operator, which retrieves the concept that follows the operator and all of its subtypes
  • A sequence of digits, which is the SNOMED CT concept identifier: 138875005
  • Optional text between vertical bars: |SNOMED CT Concept|.

When you run the query, you will get over 305,000 results, depending on which release of SNOMED CT you are working with.

For computing purposes, the concept ID (138875005) and the operator (<<) are sufficient. Text between vertical bars is optional, which means it is not processed when the query runs. In practice, the name of a concept is usually added as optional text. This makes an expression easier to read, especially when you are working with several concepts.
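The split into operator, ID and optional text can be sketched in a few lines of Python. This is only an illustration: parse_term and its regular expression are hypothetical helpers that handle a single simple term, not Snow Owl's actual parser.

```python
import re

# Hypothetical helper, not part of Snow Owl: splits one simple ESCG term
# into its operator, concept ID, and optional text between vertical bars.
TERM = re.compile(r"^\s*(<<|<|\^)?\s*(\d+)\s*(?:\|([^|]*)\|)?\s*$")


def parse_term(term):
    match = TERM.match(term)
    if match is None:
        raise ValueError(f"not a simple ESCG term: {term!r}")
    operator, concept_id, label = match.groups()
    return {"operator": operator, "id": concept_id, "label": label}


with_label = parse_term("<<138875005|SNOMED CT Concept|")
without_label = parse_term("<<138875005")

# The optional text does not change the operator or the ID that is evaluated.
print(with_label["operator"], with_label["id"], with_label["label"])
print(without_label["operator"], without_label["id"], without_label["label"])
```

Both calls yield the same operator and ID; only the label differs, which mirrors the point that the text between the bars is there for the reader.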

Retrieving concepts and reference set members

As you already know, the << operator retrieves the concept and all of its subtypes. If you want to retrieve only the subtypes of a concept but not the concept itself, use the < operator. Try running this query and look at the number of results:

<138875005|SNOMED CT Concept|

You should have retrieved one concept less than in the previous query because the SNOMED CT root concept was excluded.
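The difference of exactly one result can be reproduced on a toy hierarchy. The sketch below uses made-up concept names and assumes each concept has a single parent, which is a simplification of SNOMED CT; the function names are illustrative only.

```python
# Toy hierarchy with made-up concept names (child -> parent). Real SNOMED CT
# concepts can have several parents, which this sketch ignores.
PARENT = {
    "finding": "root",
    "procedure": "root",
    "disease": "finding",
    "infection": "disease",
}


def descendants(concept):
    """Model of '<': all subtypes, without the concept itself."""
    found = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        children = [c for c, p in PARENT.items() if p == current]
        found.update(children)
        frontier.extend(children)
    return found


def descendants_or_self(concept):
    """Model of '<<': the concept plus all of its subtypes."""
    return {concept} | descendants(concept)


print(len(descendants_or_self("root")))  # 5
print(len(descendants("root")))          # 4
```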

These operators work for any SNOMED CT concept. If you want to retrieve all clinical findings, use this query:

<404684003|Clinical finding|

These operators are not restricted to the focus concepts of the expression; you can also use them on the relationship types in refinements.

The caret operator ^ will list the members of a reference set. Here is an example for retrieving the members of the Cardiology reference set:

^152725851000154106|Cardiology reference set|

Intersection and Union

The intersection operator (+) is used on the left-hand side of the expression (before the colon) to combine focus concepts and provide a domain intersection, to which further refinements can apply.

This query will retrieve Clinical findings that are also members of the Cardiology refset.

<404684003|Clinical finding| + ^152725851000154106|Cardiology reference set|

If you want to do the opposite and get all Clinical findings except the members of the reference set, you would put the exclusion operator (!) in front of the part that you want to exclude.

<404684003|Clinical finding| + !^152725851000154106|Cardiology reference set|

The UNION operator combines the results of two queries. To find all Diseases and all Procedures, you would use the following query.

<<64572001|Disease| UNION <<71388002|Procedure|

It also works for combining queries that are more complicated like this one:

<404684003|Clinical finding| + !^447564002|Non-human simple reference set| : 
    363698007|Finding site| = <<113257007|Structure of cardiovascular system|
UNION
<<71388002|Procedure| + !^447564002|Non-human simple reference set| : 
    <<363704007|Procedure site| = <<113257007|Structure of cardiovascular system|

It retrieves all clinical findings and procedures that are related to the Cardiovascular structure and are relevant to humans. You can see that the Non-human reference set was excluded in both queries. Structure of Cardiovascular system was defined as the Finding and the Procedure site, respectively.
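Conceptually, +, + ! and UNION behave like set intersection, set difference and set union on the results of the sub-queries. The following sketch shows this with made-up result sets; the names are illustrative and do not correspond to real reference set content.

```python
# Made-up result sets standing in for the results of sub-queries.
clinical_findings = {"angina", "asthma", "feline asthma"}
cardiology_refset = {"angina", "bypass graft"}
procedures = {"bypass graft", "appendectomy"}

# '+' keeps only concepts present in both operands (intersection).
both = clinical_findings & cardiology_refset

# '+ !' keeps the left operand minus the excluded set.
findings_not_in_refset = clinical_findings - cardiology_refset

# 'UNION' merges the results of two complete queries.
findings_or_procedures = clinical_findings | procedures

print(sorted(both))
print(sorted(findings_not_in_refset))
print(sorted(findings_or_procedures))
```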

Exclusion

When you want to omit concepts or members of a reference set from your query, you can use the ! operator. It excludes the concept that follows it.

If you are only interested in concepts that are relevant to humans, it’s useful to exclude the members of the Non-human reference set. To do this, use the !^ expression, a combination of the ! exclusion operator and the ^ reference set operator, which omits all members of the reference set.

  • !^447564002|Non-human simple reference set|

You can also omit a sub-hierarchy by using the !< expression. It omits all subtypes of the concept; use !<< if you also want to omit the concept itself. For example, this expression excludes the subtypes of Disease from your query results:

  • !<64572001|Disease|

Now let’s look at some examples to see how these expressions are used in context. Suppose you want to find all Clinical findings that are not a Disease:

  • Retrieve all clinical findings including the top-level concept: <<404684003|Clinical finding|
  • Exclude the Disease sub-hierarchy: !<<64572001|Disease|
  • Use the intersection operator (+) to connect the two parts.

The full query is

<<404684003|Clinical finding| + !<<64572001|Disease|

It works the same for members of a reference set: Let’s exclude veterinary concepts from clinical findings.

  • Retrieve all Clinical findings including the top-level concept: <<404684003|Clinical finding|
  • Exclude the Non-human reference set: !^447564002|Non-human simple reference set|
  • Use the intersection operator (+) to connect the two parts.

The full query is

<<404684003|Clinical finding| + !^447564002|Non-human simple reference set|

You can also use the exclusion to express negation, e.g. look for a set of concepts that do not have a particular relationship and value in their definition.

This query

<<404684003|Clinical finding|:
    246075003|Causative agent| =  !<<409822003|Bacteria|

will return all Clinical findings that do not have a Bacteria causative agent. These concepts either do not have any causative agents at all, or they have causative agents that are concepts other than Bacteria.
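The two cases (no causative agent at all, or only non-bacterial agents) can be illustrated with a small filter. The data below is made up, and the sketch assumes that a concept is excluded as soon as any of its causative agents is a kind of Bacteria; how Snow Owl treats concepts with mixed agents is not stated here.

```python
# Made-up data: concept -> set of causative agents (empty set = none defined).
CAUSATIVE_AGENT = {
    "bacterial pneumonia": {"streptococcus"},
    "viral pneumonia": {"influenza virus"},
    "fracture of arm": set(),
}

# Stand-in for the result of <<Bacteria (the concept and its subtypes).
BACTERIA_OR_SUBTYPES = {"bacteria", "streptococcus"}


def without_bacterial_agent(concepts):
    """Keep concepts none of whose causative agents is a kind of Bacteria."""
    return {
        concept
        for concept in concepts
        if not (CAUSATIVE_AGENT[concept] & BACTERIA_OR_SUBTYPES)
    }


result = without_bacterial_agent(CAUSATIVE_AGENT)
print(sorted(result))  # ['fracture of arm', 'viral pneumonia']
```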

Refinement

The refinement operator (:) is usually used in combination with the attribute value operator (=). These operators are useful when you want to restrict a query to concepts with certain attributes. For example, you can look for all Clinical findings that have a Finding site relationship with the target concept being the Cardiovascular system.

<404684003|Clinical finding|:
    363698007|Finding site| = <<113257007|Structure of cardiovascular system|

Let’s look at another example: You want to find all bacterial infectious diseases of the lung. You would query for:

  • All bacterial infectious diseases: <<87628006|Bacterial infectious disease|
  • Use the : refinement operator to further restrict your query
  • and define the lung (body structure) as a finding site (attribute): 363698007|Finding site| = <<39607008|Lung structure|

Make sure to use the << operator to include the children of the lung structure.

The entire query looks like this:

<<87628006|Bacterial infectious disease|:
    363698007|Finding site| = <<39607008|Lung structure|

You can narrow this query to a certain kind of bacterial infection by adding the causative organism. In our example, we will use Streptococcus pneumoniae as the Causative agent. To add another attribute refinement, use a comma (,) as a separator. This query retrieves bacterial infectious diseases of the lung caused by Streptococcus pneumoniae.

<<87628006|Bacterial infectious disease|:
    363698007|Finding site| = <<39607008|Lung structure|, 
    246075003|Causative agent| = <<9861002|Streptococcus pneumoniae|
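The effect of adding a second, comma-separated attribute can be sketched as a filter in which every constraint has to hold. The relationship data and the refine helper below are made up for illustration, and each concept is assumed to have at most one target per attribute.

```python
# Made-up relationship data: concept -> {attribute: target}.
RELATIONSHIPS = {
    "pneumococcal pneumonia": {
        "finding site": "lung",
        "causative agent": "s. pneumoniae",
    },
    "staphylococcal pneumonia": {
        "finding site": "lung",
        "causative agent": "s. aureus",
    },
    "bacterial meningitis": {
        "finding site": "meninges",
        "causative agent": "s. pneumoniae",
    },
}


def refine(concepts, constraints):
    """Keep concepts whose attribute targets satisfy every constraint."""
    return {
        concept
        for concept in concepts
        if all(
            RELATIONSHIPS[concept].get(attribute) in allowed_targets
            for attribute, allowed_targets in constraints.items()
        )
    }


lung_only = refine(RELATIONSHIPS, {"finding site": {"lung"}})
lung_and_pneumococcus = refine(
    RELATIONSHIPS,
    {"finding site": {"lung"}, "causative agent": {"s. pneumoniae"}},
)

print(sorted(lung_only))
print(sorted(lung_and_pneumococcus))
```

Each additional constraint can only shrink the result, which is why the second query is narrower than the first.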

Let’s do a more advanced query: How would you search for congenital autoimmune disorders?

  • We need Clinical findings that are
  • Congenital; in SNOMED CT this is expressed with an Occurrence relationship pointing to the target concept Congenital
  • and have an associated Pathological process that is described as Autoimmune

<<404684003|Clinical finding|:
    246454002|Occurrence| = 255399007|Congenital|,
    370135005|Pathological process| = <<263680009|Autoimmune|

AND and OR

AND and OR are always used on the right-hand side of the expression, in the refinements. You can use them to specify the relationship targets for your queries.

Let’s take a look at a query that uses OR:

<<404684003|Clinical finding|:
    116676008|Associated morphology| = <<56208002|Ulcer| OR <<118622000|Fistula| 

It searches for all Clinical findings that have an Associated morphology relationship whose target is either Ulcer (or a subtype) or Fistula (or a subtype). This lets you broaden your search results by specifying more allowed targets for the same relationship type. It uses a union of the two relationship targets.

AND is used similarly, but while OR broadens your search results, AND narrows them by specifying an intersection of the two relationship targets. For example:

<<404684003|Clinical finding|:
    116676008|Associated morphology| = <<56208002|Ulcer| AND <<23583003|Inflammation|

This query searches for findings that have an Associated morphology relationship whose target is both an Ulcer and an Inflammation (e.g. ulcerative inflammations).
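In set terms, OR takes the union of the allowed target sets and AND takes their intersection before the findings are filtered. The sketch below uses made-up morphology sets and findings, and with_morphology_in is a hypothetical helper.

```python
# Made-up stand-ins for the results of <<Ulcer, <<Inflammation and <<Fistula.
ULCER = {"ulcer", "ulcerative inflammation"}
INFLAMMATION = {"inflammation", "ulcerative inflammation"}
FISTULA = {"fistula"}

# Made-up data: finding -> target of its associated morphology relationship.
MORPHOLOGY = {
    "gastric ulcer": "ulcer",
    "ulcerative colitis": "ulcerative inflammation",
    "anal fistula": "fistula",
    "gastritis": "inflammation",
}


def with_morphology_in(allowed_targets):
    """Keep findings whose morphology target is one of the allowed targets."""
    return {
        finding
        for finding, target in MORPHOLOGY.items()
        if target in allowed_targets
    }


# OR: union of the two target sets, so more findings qualify.
ulcer_or_fistula = with_morphology_in(ULCER | FISTULA)

# AND: intersection of the two target sets, so fewer findings qualify.
ulcer_and_inflammation = with_morphology_in(ULCER & INFLAMMATION)

print(sorted(ulcer_or_fistula))
print(sorted(ulcer_and_inflammation))
```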

Summary: ESCG operators

Operator  Function
<<        Retrieves the concept and all of its subtypes
<         Retrieves all subtypes of this concept, but not the concept itself
|text|    Displays the preferred term of the concept to aid readability
^         Retrieves all the members of this reference set
+         Retrieves only concepts that are results of both expressions (intersection)
UNION     Combines the result sets of two queries
!^        Excludes members of this reference set
!<<       Excludes this concept and all of its subtypes
!<        Excludes this concept’s subtypes
=         Defines an attribute refinement, e.g. a finding site or a causative agent
:         Refines an attribute range; used in combination with an attribute
AND       Used to express intersections of attribute ranges
OR        Used to express unions of attribute ranges

<< Retrieves the concept and all of its subtypes

<<138875005|SNOMED CT Concept|

< Retrieves all subtypes of this concept, but not the concept itself

<138875005|SNOMED CT Concept|

|text| Include preferred term or description to aid readability

|SNOMED CT Concept|

^ Retrieves all the members of this reference set

^152725851000154106|Cardiology reference set|

+ Retrieves only concepts that are results of both expressions (intersection)

<404684003|Clinical finding| + ^152725851000154106|Cardiology reference set|

UNION Combines the result set of two queries

<<64572001|Disease| UNION <<71388002|Procedure|

!^ Excludes members of this reference set

!^447564002|Non-human simple reference set|

!<< Excludes this concept and all of its subtypes

!<<64572001|Disease|

!< Excludes this concept’s subtypes

!<64572001|Disease|

= Defines an attribute refinement, e.g. a finding site or a causative agent

363698007|Finding site| = <<113257007|Structure of cardiovascular system|

: Refines an attribute range, operator is used in combination with an attribute

<404684003|Clinical finding| :
    363698007|Finding site| = <<113257007|Structure of cardiovascular system|

AND Used to express intersections of attribute ranges

<<404684003|Clinical finding|:
    116676008|Associated morphology| = <<56208002|Ulcer| AND <<23583003|Inflammation|

OR Used to express unions of attribute ranges

<<404684003|Clinical finding|:
    116676008|Associated morphology| = <<56208002|Ulcer| OR <<118622000|Fistula|