This video shows how to use Snow Owl for semantic queries in ESCG (Extended SNOMED CT Compositional Grammar). We will run a few sample queries to get familiar with different kinds of expressions. I will also show how to create query files and use content assist to help write scripts.

This is the last part of our three-part video series about searching. You can find the other videos of this series at:

Once you’ve seen the video, you can find additional documentation at:

Transcript

Hi! This is Sonja from B2i Healthcare.

Today I would like to show how to use Snow Owl for semantic queries in ESCG, which is the Extended SNOMED CT Compositional Grammar. We will run a few sample queries to get familiar with different kinds of expressions and operators. I will also show how to create query files and use content assist to write scripts.

Snow Owl includes an editor and execution environment for ESCG expressions. This is the editor. ESCG is a formal grammar to compose expressions that include operators and defined concept identifiers. You can see the operators are red, here's one. The concept identifiers are displayed in black. What you see in blue with the green bars is optional text. This text is not processed when the query runs; in practice it's usually added to make an expression easier to read.
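To make the role of the optional text concrete, here is a small Python sketch. It is not part of Snow Owl: the helper name `strip_terms` and the regular expression are my own assumptions, based on the optional text being written between two pipe characters. It removes the human-readable terms and leaves only the part that is actually processed, the operators and concept identifiers.

```python
import re

# Assumption: optional terms sit between two pipe characters,
# e.g. 404684003|Clinical finding|. Illustrative helper only,
# not Snow Owl's own parser.
TERM = re.compile(r"\|[^|]*\|")


def strip_terms(expression: str) -> str:
    """Return the expression without the optional human-readable terms."""
    return TERM.sub("", expression).strip()


with_terms = "<<404684003|Clinical finding|"
without_terms = strip_terms(with_terms)
print(without_terms)
```

Both forms of the expression describe the same query; the version with terms is simply easier for a person to read.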

All of the operators and grammar constructs are supported as defined in the NHS Logical Record Architecture specification, which is an extension of the HL7 Term Info specification. ESCG expressions are useful because you can query concepts by their relationships as opposed to their human-readable descriptions. For example, this query here retrieves concepts that have a finding site relationship to the structure of the cardiovascular system, so you can use it for semantic querying.

Here is an overview of the different operators that we will be using today. This is what appears in red in the query script. To find more detailed information, please use the online help on our website at www.b2i.sg. There's also a “Getting started with semantic queries” tutorial there, which guides you through a series of sample queries.

 

Let’s go back to Snow Owl.

Query files are located in the Project explorer view. I already prepared some sample queries. This here is a project folder: If I open it, you can see the different queries – they all have an .escg extension.

If you want to create a query file, you can either do it in a folder that already exists, or you can create a new project folder and then create a query file in that folder. Let me show how to create a folder. Just right-click on the view, go to “New project” and then enter a project name, for example “My queries”. Then click “Finish” and here's the new project folder. If I right-click this folder, I can go to “New file” and enter a name for the file. It's important to always end the name with .escg because that is the extension. If you don't add this, it will still create a file, but it's not going to be an .escg file.

If I open this here (I double-click it), we can see the empty editor where I can enter a query script. It works just like a normal text editor. There’s an error message because this is an incomplete query. It might be useful to save queries, for example if you import a new release and want to rerun the query with the new data. Files can also be deleted. Just right-click and go to “delete”, it’s very easy. You can also delete folders this way.

If you’re using the free version of Snow Owl, you will find a folder here, which is called “B2i examples”. It contains sample queries and some explanation about the different expressions, so it’s like a mini tutorial that we included.

OK, now let's look at a query. This is a pretty complicated one. Let's start with an easier one. This one has only one concept in it. You see the operator in red. This operator retrieves the concept behind it, which is Clinical Finding, and all of its subtypes. Behind the operator you always need to enter the concept identifier. If I run this query, it will retrieve all Clinical Findings, which are almost 100,000 concepts.

It's very easy to run the query: I just click this button here (this is the Execute button) and it executes my query script. This takes a bit because it's a very big query with lots of results. Here are the results displayed in the search view. You see I have 99,669 results, which are the subtypes of Clinical Finding plus the parent concept Clinical Finding, which is included as well. This is why it has an extra match.

You might already be familiar with the search view from other videos. There’s a filter search here, so we can filter for a particular term. If I double-click on one of the matches, the editor opens up and we can see information about this particular concept. If you have a reference set open, let me just open this one, you can also add a search result to the reference set or you can even add all of the results to the reference set. We can also add bookmarks here, mark different results and add them to the reference set so it’s quite useful.

Let's go back to our query. I close this one here and try to do this with a different concept. Since it's a text editor, we can simply delete concepts here. Whenever there's a problem you will see this little symbol here. If you hover over it, it displays more information: this one is incomplete, so I need to add a concept here.

There are different ways of adding a concept. You can enter the concept ID, copy and paste the concept ID, or take a concept from the view here and simply drag it into the editor, which automatically adds the ID and also the name of the concept. So now this query will retrieve all Events and the root concept Event as well, so we should have exactly 3,664 results. This is it here, and if I open this, these are all Events. Let me just show you the root. The parent concept is here as well. Here it is, so this is “Event”.

If we wanted to exclude this, we have to delete one of those symbols to use the < subtype operator. If I run the query again, I get one result less and if I do filter for “event” it’s not there anymore, so the parent concept is not part of the results anymore. OK.

If you have the symbol twice << it retrieves the concept and its subtypes. If you have only one < it’s only the subtypes without the parent concept.
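The difference between the two operators can be modeled in a few lines of Python. This is only a sketch on a made-up hierarchy: the concept names and the helper names `descendants` and `descendants_or_self` are hypothetical and do not come from SNOMED CT or Snow Owl.

```python
# Toy hierarchy (hypothetical names, not real SNOMED CT content):
# each concept maps to its direct children.
CHILDREN = {
    "Event": ["Accidental event", "Exposure"],
    "Accidental event": ["Fall"],
    "Exposure": [],
    "Fall": [],
}


def descendants(concept: str) -> set:
    """Model of the < operator: all subtypes, without the concept itself."""
    found = set()
    stack = list(CHILDREN.get(concept, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(CHILDREN.get(current, []))
    return found


def descendants_or_self(concept: str) -> set:
    """Model of the << operator: the concept plus all of its subtypes."""
    return descendants(concept) | {concept}


only_subtypes = descendants("Event")
with_parent = descendants_or_self("Event")

# The two result sets differ by exactly one concept: the parent itself.
print(len(with_parent) - len(only_subtypes))
```

This mirrors what we saw in the search view: switching from << to < gives exactly one result less, because only the parent concept drops out.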

OK, let’s look at a different one. I made a change to the query and this little star symbol appears here, which means there are unsaved changes. If I would like to save it, I would just click this one here and then the query is saved.

Let's go to the next one, which is how you retrieve reference set members. So far we've learned how to retrieve concepts and their descendants. For reference sets, you have to use a different operator, which is this one: ^, a little caret operator. This query retrieves the members of the Non-human simple reference set, which I already opened here. As you can see, it has 1,915 members (let's close this one). Now let's execute our query. We get the same number of results here, so these are all veterinary terms that are in this reference set.

It's also possible to combine two of these expressions. Let's look for Clinical Findings (I saved it so you can just copy and paste, let's paste it here) and combine these two queries: all Clinical Findings that are also a member of the Non-human simple reference set. You see that something is missing here. There's a problem, and this is when content assist comes in handy.

Just press Ctrl + Space to bring up the content assist menu. It shows only operators that are valid at this particular point of the query so wherever your cursor is. I have only three options here, I will use the first one with the plus sign + for intersection. Let’s run it.

Now this query retrieves Clinical Findings that are also a member of the Non-human simple reference set. If we look at the reference set, it would be all of these here but not the ones that are not a Clinical Finding like these, so if I type for instance “hoof” you see only the Clinical Findings come back but not the Procedure so it’s what we wanted.

OK, let's look at a different operator that we can use here. I use Ctrl + Space again for content assist and select UNION. So far we have narrowed our search; in contrast, UNION broadens the search. You can also use a space in between or start a new paragraph, which makes it a little bit easier to read but has no influence on the query. Now we would like to retrieve all Clinical Findings plus all members of the Non-human reference set, so we get a lot more results: we should have over 100,000. This one now also includes the Procedures. If I enter “hoof” again, we see that all Procedures and Body Structures are included as well.

What you would probably like to do in practice is not include veterinary terms but exclude them. There's an operator for excluding. I will use the query I had before (the intersection) and use content assist to select the exclamation mark (!). It excludes the members of the Non-human reference set, so these are all Clinical Findings but not the veterinary terms. If I run this again, I should not find any veterinary terms. Let's look at the results and search for “hoof”: as expected, all of these terms are gone now. You can also exclude a sub-hierarchy this way (let me just remove this one). I would like to exclude the sub-hierarchy Disease, so I just move it over and run this query. Disease is a pretty big sub-hierarchy, it's over 66,000 concepts, and Clinical Finding is over 99,000. I should get far fewer query results. You see it's only about 33,000 because this whole sub-hierarchy was excluded from the query. OK, so this was the exclamation mark.
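The three ways of combining queries that we have seen so far (+ for intersection, UNION, and ! for exclusion) behave like ordinary set operations. Here is a minimal Python sketch of that idea; the concept names and the tiny data set are made up for illustration and are not real query results.

```python
# Hypothetical miniature data set; the names are illustrative only.
clinical_findings = {"Fracture of arm", "Hoof abscess", "Pneumonia"}
non_human_refset = {"Hoof abscess", "Hoof trimming", "Hoof structure"}

# + (intersection): Clinical Findings that are also reference set members.
both = clinical_findings & non_human_refset

# UNION: everything that is in either result, so the search gets broader.
either = clinical_findings | non_human_refset

# ! (exclusion): Clinical Findings that are not reference set members.
human_only = clinical_findings - non_human_refset

print(sorted(both))
print(len(either))
print(sorted(human_only))
```

In this toy data the intersection keeps only the veterinary finding, the union also brings in the procedure and the body structure, and the exclusion leaves only the findings that are not veterinary terms.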

Another expression that is quite useful looks like this. It's a combination of two operators. One is the colon (:), which is the refinement operator, and then there's the equal sign (=), which is the attribute value operator. This query is looking for bacterial infectious diseases (so this concept and its descendants) that have a finding site in the lung. We would probably call this bacterial infectious diseases of the lung, but in ESCG terms the concept has a finding site relationship whose target concept is lung structure; that is what the equal sign expresses. Let's execute this query and open one of the results, where we can see that it has a finding site of the lung structure.

We can even narrow this query by entering a comma here as a separator. Let’s add a causative organism. I’m going to use content assist again (Ctrl + space) and now I will just add a causative agent, I just select and hit return and then =

Now let's look for an organism. If I want to look for a concept in content assist, I go to Content > Add new concept from picker and then hit return. This brings up the Quick Search that you might already be familiar with. It's the same as when you are just searching for concepts up here. Let's type in the causative agent, which is an organism. I click it and it automatically adds the concept ID. Now this query is for Bacterial infectious diseases of the lung with Streptococcus pneumoniae as a Causative agent.

This result set should be a lot smaller, and you see there are only three results. Let's open one of those: here's the Causative agent, which is the relationship that is specified here, and we have a Finding site of the lung structure. This is quite neat because you can add as many attributes as you want this way.
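The way each additional attribute narrows the result can be sketched in Python. This is a simplified model, not how Snow Owl evaluates a refinement: the concepts and their relationships are invented, the helper `refine` is hypothetical, each concept has only one value per attribute, and values are compared exactly instead of also matching subtypes.

```python
# Hypothetical concepts with their attribute relationships (illustrative only).
CONCEPTS = {
    "Lung infection A": {
        "Finding site": "Lung structure",
        "Causative agent": "Organism X",
    },
    "Lung infection B": {
        "Finding site": "Lung structure",
        "Causative agent": "Organism Y",
    },
    "Skin infection C": {
        "Finding site": "Skin structure",
        "Causative agent": "Organism X",
    },
}


def refine(concepts: dict, required: dict) -> set:
    """Keep concepts that match every required attribute = value pair."""
    return {
        name
        for name, attributes in concepts.items()
        if all(attributes.get(attr) == value for attr, value in required.items())
    }


# One attribute: everything with a finding site in the lung.
in_lung = refine(CONCEPTS, {"Finding site": "Lung structure"})

# Two attributes (the comma in the query): both conditions must hold.
in_lung_by_x = refine(
    CONCEPTS,
    {"Finding site": "Lung structure", "Causative agent": "Organism X"},
)

print(sorted(in_lung))
print(sorted(in_lung_by_x))
```

Every attribute added after a comma is one more condition a concept has to satisfy, which is why the result list keeps shrinking.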

The last two operators I would like to show are AND and OR. They are always used on the right side of the refinement expression. For example, if we wanted to combine two queries, we would use + or UNION depending on what we wanted to do. However, if you are within this refinement expression, you have to use AND or OR. So what does AND do? This query is Clinical findings with an Associated morphology of Ulcer AND Inflammation, so both of these criteria have to be met. Let's run this and see the results. Associated morphology: Ulcerative inflammation, so both of the criteria were met. I narrowed the search to 89 results.

Now let's use OR. This means we're looking for Clinical findings with an Associated morphology of either an Ulcer OR an Inflammation. We should have a lot more than 89 results. It's over 6,000. Let's look at this one, Inflammation, and let's see if I can find something with ulcer. It'll probably be easier if I just filter for “ulcer”, and here the Associated morphology is Ulcer.
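Here is a small Python sketch of how I read AND and OR on the right side of a refinement, based on the results above: with AND the attribute value has to be a kind of both concepts (like Ulcerative inflammation), with OR it is enough to be a kind of either one. The morphology hierarchy, the finding names and the helper `self_and_subtypes` are hypothetical, and this is a model of the behavior, not Snow Owl's implementation.

```python
# Hypothetical morphology hierarchy: each concept maps to its direct parents.
PARENTS = {
    "Ulcer": [],
    "Inflammation": [],
    "Ulcerative inflammation": ["Ulcer", "Inflammation"],
    "Chronic ulcer": ["Ulcer"],
}


def self_and_subtypes(concept: str) -> set:
    """The concept itself plus every concept that has it as an ancestor."""
    def has_ancestor(candidate: str) -> bool:
        return any(p == concept or has_ancestor(p) for p in PARENTS[candidate])

    return {c for c in PARENTS if c == concept or has_ancestor(c)}


# Hypothetical findings, each with one Associated morphology value.
FINDINGS = {
    "Finding A": "Ulcerative inflammation",
    "Finding B": "Chronic ulcer",
    "Finding C": "Inflammation",
}

ulcer_values = self_and_subtypes("Ulcer")
inflammation_values = self_and_subtypes("Inflammation")

# AND: the morphology must be a kind of Ulcer and a kind of Inflammation.
matches_and = {f for f, m in FINDINGS.items() if m in ulcer_values & inflammation_values}

# OR: the morphology may be a kind of Ulcer or a kind of Inflammation.
matches_or = {f for f, m in FINDINGS.items() if m in ulcer_values | inflammation_values}

print(sorted(matches_and))
print(sorted(matches_or))
```

As in the real query, AND gives the short list and OR gives the long one.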

OK, I hope you enjoyed this little introduction. Thanks very much for your attention. Bye-bye.