An Open Clinical Terminology?

(Kev Mayfield) #12

Ta da… in JSON or XML ValueSets

It won’t work on large CodeSystems such as SNOMED or LOINC (nor will it expand SNOMED ValueSets, because for that we need a terminology server).

(wolandscat) #13

Well, JSON is one of many lingua francas, and it only works some of the time. Turtle syntax is another, that is more relevant to this situation. Serialised object dump format is always a secondary consideration; the formal models of anything are what really matters. Abstract syntaxes are more important (Java, Ruby, OWL, ADL, etc).

Some things are down to history as well. For example, with archetypes, we invented a format now called ODIN 15 years ago, when there was no JSON, and that does a lot more than JSON. If there had been JSON, we would have had to seriously upgrade it to some sort of JSON2. But today, we can pump out an archetype in JSON, YAML, XML, ODIN and ADL. These formats come and go.

(Adrian Wilkins) #14

Speaking as someone with a medical degree and more than 5 years of experience coding SNOMED CT tools, I don’t really understand SNOMED CT.

The best I could usually manage was to fit enough of the semantics in my head to work on the bit of tooling I was developing. Which is fine if you’re a programmer, because you get to leave chunks of your mind lying around in text files for the computer to use later.

For clinicians, who also have to remember how to medicine? I’m not really sure it’s practical…

(Pablo Pazos Gutiérrez) #15

On the lack of ready readability/comprehensibility of the ‘code’: the answer is the same reason we need terminology systems in the first place. Language is ambiguous; codes are not. Programs need codes, humans need terms and phrases. Codes should be opaque: not processable internally and holding no intrinsic semantics. Codes are mapped to semantics in the terminology system, which can also store synonyms, acronyms and related terms, so humans can find the correct concept to record, and that concept is then stored as a code. Analytics work over those recorded codes, not over terms and phrases. Concepts can also be used not just individually but ontologically. In SNOMED CT, for instance, you have expressions to express more generic concepts or to combine concepts. You have a code for each type of diabetes, but you can say “any type of diabetes” in an expression and use that to query a database and get back the patients with that health problem. The same happens with ontologies of drugs. The real power is in having all that ontology behind the codes, not in using individual codes directly.
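The “any type of diabetes” query can be sketched with a toy hierarchy. This is a minimal illustration only: the four is-a edges and the patient records below are made up for the example, the hierarchy is heavily simplified compared with a real SNOMED CT release, and `descendants_or_self` and `patients_with` are hypothetical helper names. A real system would ask a terminology server to do the subsumption.

```python
from collections import defaultdict

# Toy is-a edges as (child, parent). A simplified illustration, not a SNOMED CT release.
IS_A = [
    ("46635009", "73211009"),   # Diabetes mellitus type 1 -> Diabetes mellitus
    ("44054006", "73211009"),   # Diabetes mellitus type 2 -> Diabetes mellitus
    ("73211009", "64572001"),   # Diabetes mellitus -> Disease (simplified)
    ("195967001", "64572001"),  # Asthma -> Disease (simplified)
]

# Hypothetical recorded problems: (patient id, opaque code).
RECORDS = [
    ("patient-1", "46635009"),
    ("patient-2", "195967001"),
    ("patient-3", "44054006"),
]


def descendants_or_self(concept):
    """Return the concept plus everything below it in the toy hierarchy."""
    children = defaultdict(set)
    for child, parent in IS_A:
        children[parent].add(child)
    found, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(children[current])
    return found


def patients_with(concept):
    """Find patients whose recorded code is the concept or any subtype of it."""
    wanted = descendants_or_self(concept)
    return sorted(patient for patient, code in RECORDS if code in wanted)


print(patients_with("73211009"))
```

The query names only the general concept; the hierarchy supplies the specific codes, which is the point about using concepts ontologically rather than one code at a time.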

(Kev Mayfield) #16

I’m not going to quote my sources, but there is a view that the advanced features of SNOMED are actually preventing its adoption. If we don’t push them, then adoption and understanding of SNOMED would increase.

(Kev Mayfield) #17

A related post is here; it concentrates on the technical side: HOWTO - HL7v3/IHE XDS OID's, URI's and HL7v2 Tables and new Terminology Support in CCRI

(rory) #18

Interesting to read this thread, and I just wanted to add a few things (putting my cards on the table: I work at SNOMED International and so you can guess which side of the argument I fall on!)

Whilst the terminology is licensed (that’s a debate for others to have), the days of the infamous Workbench and other proprietary software mentioned in this thread are long, long gone (including the BDB!). We develop a lot of software, all of which is available as Apache v2 open source, and all of which is built with the aim of making SNOMED CT easier to use.

Probably of most interest to those on this thread is our open source SNOMED CT terminology server, Snowstorm (GitHub - IHTSDO/snowstorm: Scalable SNOMED CT Terminology Server using Elasticsearch), which makes it very easy to load SNOMED CT (in 15 minutes) and to query and access the terminology over both FHIR and more direct REST APIs. There’s a lot of other software available there, including that used for the current NHS SNOMED CT terminology browser.
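As a rough idea of what access over FHIR looks like, the sketch below builds a `ValueSet/$expand` request that uses FHIR's implicit SNOMED CT value set URL with an ECL expression. The base URL `http://localhost:8080/fhir` is an assumption about a local install, `build_expand_url` is a hypothetical helper, and nothing is sent over the network; check the server's own documentation for the actual endpoint and supported parameters.

```python
from urllib.parse import urlencode

# Prefix for implicit SNOMED CT value sets defined by an ECL expression (FHIR convention).
SNOMED_IMPLICIT_VS = "http://snomed.info/sct?fhir_vs=ecl/"


def build_expand_url(fhir_base, ecl, count=20):
    """Build a FHIR ValueSet/$expand URL for the concepts matching an ECL expression."""
    query = urlencode({"url": SNOMED_IMPLICIT_VS + ecl, "count": count})
    return f"{fhir_base.rstrip('/')}/ValueSet/$expand?{query}"


# Assumed local server address; "<< 73211009" means diabetes mellitus and all its subtypes.
url = build_expand_url("http://localhost:8080/fhir", "<< 73211009")
print(url)
```

An HTTP GET on that URL against a running terminology server should return a ValueSet resource whose expansion lists the matching concepts.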

The software sitting in our GitHub repos hopefully helps to make it a little more approachable, and we are always looking to help where we can, so please do reach out to us.

Open Source - for developers
(Kev Mayfield) #19

It sounds exactly like something I’ve been looking for.

Would this work with UK RF2 SNOMED? (Sounds like it.)

(rory) #20

Yes, it should work with any SNOMED CT extension/edition. We’ve not tested with the UK edition, but have done with others, including those with other languages. Now that my curiosity is piqued, I’ll try it out and see if it does.

(Marcus Baw) #21

This is fantastic stuff and is exactly what I had hoped SNOMED would be doing - putting out good and usable, permissively-licensed open source tooling to make it easier to work with. The Dockerization of the stack also really helps for those who don’t want to sully their development machine with Java badness :wink: Top marks for the docker-compose.yml which makes getting the whole stack up and running super easy.

I’m still surprised that a better way hasn’t been developed for uploading the SNOMED CT files, i.e. could this step not be managed with some kind of SNOMED ‘package manager’, or even using a Git server and URLs? I appreciate it’s only a single manual step, but manual steps are the enemy of continuous integration and other forms of automation.


(rory) #22

Yes, absolutely. For now, we’re trying to walk before we can run and have been focussed on making sure the main functionality is good to go. This terminology server is also the one that we will use for authoring/managing the terminology, so it has a rich feature set, most of which is not needed by most users.

However, we’ve had requests for this from developers in other countries, so our plan is to develop the functionality to allow devs/users to ‘request’ an edition/extension which will then be retrieved and imported without any other manual intervention, with the correct version being imported (a snapshot version or a delta if there is an existing version in the terminology server).

Hopefully later this year, after we’ve completed full FHIR compliance and other things on our shopping list.

(Adrian Wilkins) #23

We got the ICD-10 toolchain going on Git (and the foundations of the UK ICD-10 tools are as a result arguably better than the WHO ones), but doing this for SNOMED CT is a harder problem, not because of the design of Git (the object model of Git is a great design for lots of collaborative works), but because of the underlying limitations of “file system” as a database. We got away with it on ICD-10 because it involved on the order of 10^5 nodes. SNOMED CT is an order of magnitude bigger at 400k nodes for just the core graph, and while *nix file systems are probably OK with that, NTFS and Windows really start to choke on that many files (not to mention, the multiplication of the typical overhead of all the virus checking and scanning tools most IT departments stick on Windows).
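To make the file-count concern concrete, here is a small sketch of how Git addresses content. Git names a blob by the SHA-1 of a short header plus the content, and stores an unpacked ("loose") object as one file under a two-character directory. Storing one terminology node per blob is an assumption made for illustration, not a description of the ICD-10 toolchain, and the helper names are hypothetical.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object id Git assigns to a blob with this content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Path of an unpacked object inside .git: first two hex digits form the directory."""
    return f"objects/{object_id[:2]}/{object_id[2:]}"


# Hypothetical serialisation of a single node; one such blob per node means one file each.
node = b"A00.0\tCholera due to Vibrio cholerae 01, biovar cholerae\n"
print(loose_object_path(git_blob_id(node)))
```

With one blob per node, every node is a separate file until Git packs them, so the file count grows with the size of the graph, which is where file systems and scanning tools start to matter.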

It’s a great ambition though. Part of the problem with the (hooray!) defunct IHTSDO Workbench was that it tried to solve the version control problem with 30 year old version control design (internally it was designed like RCS - only without the convenience of abstracting the version control layer away by) - those models were originally designed for version control of single objects and things like CVS are just hacks on top to orchestrate the companion versioning of multiple objects, rather than models like Git which treat revisions as a single composite object.

One of the things I seriously looked at when working on those tools was backend libraries for Git that used something other than a raw file system for storage, and the state of the art in that space may be a lot more mature now.

I had the notion that the systems for authoring and distribution should be not that different and Git seemed like a good choice to stand that on.

The other core design decision that was a problem in those tools was trying to make a generic terminology model that was itself a generalization that would support SNOMED CT. That led to a great deal of complexity that sat squarely on top of what was already a complex metamodel designed to be very general. I note that snowstorm is billed as a “SNOMED CT terminology server” and not “a terminology server” and I presume (hopefully) that means it’s not trying to be a general terminology server.

(Peter Loa) #24

The main problem with open sourcing is its ever-changing nature, which makes it impossible to be consistent and comparable over time.

SNOMED is consistent and abhors change without reason, and its decisions are accepted and used by all of its users.
I can’t see that happening in the open source world…

(Marcus Baw) #25

I don’t think you’ve fully understood open source, judging by this comment. Are you involved directly in any open source projects? There is still centralised control of any open source project. It isn’t a complete free-for-all.

I would disagree with this quite strongly. Ironically, I am this week involved in some discussions at national UK level about serious and potentially breaking changes that SNOMED International are apparently imposing, which will have a huge deleterious impact on our install-base of UK GP Systems. (more on this soon)

Yes the UK has representation at SNOMED International, however this tends to be an expert Terminologist rep, not a clinical rep. The proposed changes, while working towards an ontologically perfect terminology, may significantly undermine the actual primary purpose of the terminology (ie recording clinical care).

(Peter Loa) #26

Not sure that open source would be a good solution rather than changing the governance structure of SNOMED International.

(Marcus Baw) #27

My experience has generally been that “changing the governance structure” of large international organisations is non-trivial.

"Hi SNOMED International, Marcus here. Yes that one. I’m new here and I’m not a career terminologist, but I’m just wondering if you’d mind changing your entire governance structure for me? Aiming to be more community-oriented, consensual, and clinically-relevant. You know, like an open source project?



They hung up - can you believe that?"

(Hans Hendrickx) #28


I have a long answer and a short one. The short one is that I believe that codifying medicine is killing the essence of medicine. William Osler coined the essence as: “Medicine is a science of uncertainty and an art of probability.” During the 50 years I have worked in hospitals all over the world, I have always enjoyed a good letter to amice. For the last 20 years, at best, I have had to accept telegram-style, nonsensical referral letters; mostly the patient was transferred with a one-word referral, like ‘headache’, ‘gallbladder’, ‘acute abdomen’. Often the single word had no close relationship to the patient or the complaint, and came without any mention of context. Dr Thornley just wrote a blog about the “Demise of Medicine”. It is a disaster waiting to happen?

Surprisingly, I do like ontologies, because if designed well they represent at each step a question/decision/answer. The problem is the uncertainty we have to deal with. So, the idea of Diagnosis Related Reimbursement of doctors is absurd, and yet it is entertained everywhere in the world by insurance companies and political parties. ICD10-11 is very popular for this purpose, even though it is designed for classifying causes of death, not daily practice. No surprise: in over 40% of death certificates, pathologists cannot relate the text in the certificates to reality, and diagnostics in real life are even worse. Relating ICPCs and ICD10-11 is impossible, and is an example of the serious gap between the GP-bubbles and those of specialists. My long-held conclusion is that we have developed natural language over 1,000,000 years, and 60 years of codifying has been a very nice experiment, which now kills the essence of medicine: dealing with uncertainty.

The right diagnosis can be pinpointed in 80% of cases by good Medical History Taking. We need natural language for that and smart questions. This is how doctors (should) think, based on symptoms and signs the patient can report and show. That is called communication. The gathered data should be collected into a casebook with a pattern every doctor should have learned in medical school. In my own experience nowadays, visiting a doctor means an encounter with a person who is glued to a square screen, does not introduce 'it’self, and often turns out to be a nurse or aide. NIH informs patients that a large part of medical care is provided by nurses. In the UK this has been ‘codyfied’.

As an anesthesiologist I deal with easy work, which if it goes wrong has serious consequences. This is the typical business model for insurance revenues, dealing with incidents with high impact. That has triggered me into my quest for intelligent, smart and efficient medicine. IT has a lot to offer, because smart, intelligent and dynamic questionnaires are able to extract very useful data from patients, and those data can be translated into structured and patterned information doctors like. This is the essence of my current work, MediPrepare Open Source Project. Every doctor now could create Expert Medical Systems with our tools by creating questionnaires for all 130+ specialties and incorporate their expert knowledge. The data can be translated into valuable information leading up to a differential diagnostic path which can be started by the patient.

So, I believe that using the route of natural language in Medicine will, for many years to come, be superior to communicating in digitized codes. Eventually smart computers will be able to dissect our natural language to the point that we can let them communicate by digits. For now, Codifying Medicine is killing patients and doctors. In the USA the third cause of death is medical mishaps… Maybe we need meaningful IT, created by close cooperation of doctor and programmer, as was used in the Caduceus Project at Pittsburgh University Hospitals around 1980.

My 5 cents, Hans

(Adrian Wilkins) #29

Hans, thank you for so eloquently stating an opinion I arrived at after close to a decade of maintaining healthcare code system maintenance tools, which I carry forward into my opinions about EHR software - it’s all about communication, and the endpoints that really matter are the humans.

If you’re going to communicate with the limited endpoints of an API to elicit a service, sure, codify your inputs. But to me, the vast bulk of the drive to codify and structure health data is from the (noble, but also potentially profitable) desire to mine it for data, rather than the desire to serve an individual patient better.

As you say, getting the right code from ICD-10, which in the UK is around 14,000 codes, is hard enough. Having your insurance payment depend on having chosen the right code is harsh.

I hear ICD-11 is far more complex. But probably a game of tic-tac-toe next to using SNOMED CT for the same purpose.

(Kev Mayfield) #30

Do any secondary care systems code as you type??

So when you type ‘patient has asthma’ it automatically prompts you to select a code for asthma?

I’ve not seen it. I’d always had the impression, doctors would code items as it made it easier for them to drill into the medical record at a later time (in primary care). As a side effect it enabled reporting. [However in other sectors, codes seem to be done primarily for reporting, not care]
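For what it's worth, the most naive version of “code as you type” would be something like the sketch below: look for known terms inside the typed text and offer the matching codes. The two-entry term index and the `suggest_codes` name are made up for illustration; a real system would query a terminology server and would have to handle negation (“no asthma”), misspellings and synonyms, none of which this does.

```python
# Hypothetical term index: lower-case search term -> (code, display term).
TERM_INDEX = {
    "asthma": ("195967001", "Asthma"),
    "type 2 diabetes": ("44054006", "Diabetes mellitus type 2"),
}


def suggest_codes(typed_text):
    """Return (code, display) pairs whose search term appears in the typed text."""
    text = typed_text.lower()
    return [concept for term, concept in TERM_INDEX.items() if term in text]


print(suggest_codes("Patient has asthma"))
```

Note that this would also prompt for asthma on “patient does not have asthma”, which is exactly the kind of problem a plain substring match cannot solve.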

Clinical Autocomplete / Autosuggest - how not to do it
(Marcus Baw) #31

I’ve got so much to say on this I’ve started a new thread so as not to hijack this one.