Magic Data Lakes

A recent POLITICO article describes how “virus hunters” (public health experts charged with tracking infectious diseases like COVID-19) are forced to use archaic, 20th-century technologies to investigate cases and share their findings: fax machines and Excel spreadsheets. Granted, there is no mention of typewriters and rotary phones, but still, it’s reasonable to expect much better. So, what’s the problem? Why is it that we can withdraw money from an ATM anywhere in the world, but we can’t easily share COVID-19 lab results between hospitals located across the street from one another? In other words, why can’t healthcare information technology “interoperate”?

I am routinely annoyed when people ask me the ATM question (wait, didn’t I just pose that question a few sentences ago?). To me, the attempt to equate ATM data with electronic health record (EHR) data rings hollow. Banks are exchanging some numbers. Deposit a check for $500.00 into account 12345 at bank 84934. Withdraw $40 from account 44325 at the same bank. Am I oversimplifying here? Perhaps, but not by much. Exchanging health data is far more difficult. Why is this the case? Mostly because we’ve made it that way.

Banks are highly regulated by the government, and for good reason. Consumers must know that a savings account has certain core characteristics, and that those characteristics will always be found in savings accounts from any US bank (FDIC-insured so the deposits are safe, interest rates described in standard, understandable ways, etc.). While some eschew all regulation, without standards and rules there is no basic trust. And without trust, we can’t trade or carry on business.

Am I implying that healthcare isn’t regulated? Far from it; we’re highly regulated . . . in some areas. Insurance companies and the federal government tell physicians what they must document in order to charge for various services. Quality organizations dictate minimal standards for how an operating room must be cleaned, stocked, and utilized. But when it comes to how we order tests, report the results of those tests, or record what diseases or problems a patient may have, most standards go out the window. A physician at Hospital A may order a test called “SARS-CoV-2 RNA,” but at Hospital B, the doctor orders a “Novel coronavirus, qualitative, RT-PCR” test. Same thing? Perhaps. Perhaps not. Someone had better figure that out, because those results are sent to the local health department.

Dr. John Lee, the CMIO at Edward-Elmhurst Healthcare and someone you should follow on Twitter, put it this way (reprinted with permission): “From an informatics perspective, the lesson that I hope we learn from [the COVID-19 pandemic] is that we need a way to rapidly abstract, normalize, and aggregate information. This goes beyond the concept of ‘interoperability’ or at least ‘interoperability’ as it has been portrayed in both the lay and [healthcare information technology] industry press. I have a sense that there is a notion that vomiting all of our patients’ data in a magic data lake will yield the knowledge that we are all seeking. However, that is like pouring crude oil in your car and expecting it to run.”

Like you, I’m dismayed that magic data lakes aren’t real! Even with our very expensive and very good EHRs, it’s not reasonable to expect that we can send the same result to the same health department, use very different names for said results, and expect a good outcome. Heck, as referenced by the POLITICO article, not only are we calling the results by different names, many of us can’t even send the results electronically from the hospital’s IT system to a local health department’s IT system (assuming it has an IT system – there’s not a lot of profit in public health, at least not yet). Hence, hospitals send results of pandemic testing to health departments via fax, and many of those health departments use off-the-shelf spreadsheet software to tally results. What could go wrong?

How can we fix some of these problems? Two solutions are right there in front of us. We could, like the banks, have federal regulation that dictates the naming of lab tests, how those test results must be displayed in the EHR, and how the results must be packaged up and sent to those who need to see them (e.g., the patient, the local health department, and the CDC). Alternatively, we could have a standards-setting organization (see my recent blog post referencing Unicode) mandate that we “translate” our terms into their terms before sending off data. This is called normalizing data, and I’ll be writing about that process in my next post!
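
In the meantime, here is a rough sketch of what that "translation" step might look like, using the two test names from the Hospital A and Hospital B example. Everything in it is an assumption made for illustration: the shared code "STD-COVID-PCR" is a made-up placeholder rather than an identifier from any real terminology, the function and field names are hypothetical, and the decision that both local names mean the same test is exactly the judgment a human expert would have to make first.

```python
# Hypothetical lookup table: (hospital, local test name) -> shared code.
# "STD-COVID-PCR" is a placeholder invented for this sketch, not a real
# standard identifier. Treating both local names as the same test is an
# assumption that someone qualified would need to confirm.
LOCAL_TO_STANDARD = {
    ("Hospital A", "SARS-CoV-2 RNA"): "STD-COVID-PCR",
    ("Hospital B", "Novel coronavirus, qualitative, RT-PCR"): "STD-COVID-PCR",
}


def normalize_result(hospital: str, local_name: str, result: str) -> dict:
    """Translate one local lab result into a shared vocabulary before sending."""
    key = (hospital, local_name.strip())
    if key not in LOCAL_TO_STANDARD:
        # Refuse to guess: an unmapped name needs human review.
        raise KeyError(f"No mapping for {local_name!r} at {hospital}")
    return {
        "source": hospital,
        "test_code": LOCAL_TO_STANDARD[key],
        "result": result.strip().lower(),
    }


report_a = normalize_result("Hospital A", "SARS-CoV-2 RNA", "Detected")
report_b = normalize_result(
    "Hospital B", "Novel coronavirus, qualitative, RT-PCR", "detected "
)

# After normalization, the health department can tally both reports together.
print(report_a["test_code"] == report_b["test_code"])
```

The sketch deliberately raises an error for any name it has not seen before rather than guessing, because a wrong match between two differently named tests is worse than a result held back for review.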

Whether we bow down to a higher authority or normalize our data so it’s actionable, what’s clear is that the current state of affairs can’t be maintained for long. We don’t need a supercomputer or Einstein to figure out this problem. It’s in everyone’s interest to share these data in near real time, and we can do it if the will is there.