Data Lakes – The Next Big Thing in Healthcare IT?

In an age when every keystroke on your keyboard or swipe on your phone is tracked, the era of Big Data is thriving. The arrival of Microsoft Azure, first announced in 2008, finally gave the Healthcare Industry access to information that, up until that point, had been accessible only through large companies such as IBM. The ability for the Healthcare Industry to pull information based on mass amounts of accurate data was nothing short of revolutionary.

The advent of this mammoth data machine altered the face of both the for-profit and non-profit sectors. It changed the way nearly all organizations worked and created entirely new industries. With the arrival and popularity of mobile applications in the late 2000s, the business of tracking data all but exploded. Soon preventative health was being tackled by companies such as Fitbit, which created a personal activity tracker that measures and tracks heart rate, sleep activity and number of steps walked.

The flood of data coming in, literally, from all corners of the world was organized into countless institutional Data Warehouses. Early industry predictions suggested that this mass of data would lead healthcare researchers to quickly uncover information that could lead to cures or treatments. While this newfound data assisted greatly, flaws in the Data Warehouse concept were soon discovered.

The modern concept of the Data Warehouse began in the late 1980s. An article published in the IBM Systems Journal in 1988 coined the term “business data warehouse”. Bill Inmon (the ‘father’ of data warehousing) began to discuss Data Warehouses as far back as the 1970s and in the early 1990s published the industry bible, Building the Data Warehouse. Inmon’s model for data warehousing concentrates on a centralized data repository.

Healthcare providers and researchers began to realize that under this model the data was much harder to access and often was not helpful to their research. The main issue they faced was that Data Warehouses were designed and controlled by a diverse range of operators, from hospitals to research centres. These Data Warehouses employed the concept of ‘schema on write’, meaning that data is organized to fit a predefined structure as it is added to the warehouse. In fact, data is not even loaded until its eventual use is determined. For healthcare providers and researchers, this meant relying on countless institutions and their respective warehouse designs. The information culled from disparate Data Warehouses was at times inconsistent and conflicting. The ‘schema on write’ method also prevented data from being entered in a timely manner, since all information first had to be surveyed and analyzed through individual systems. Healthcare leaders realized that what they needed was access to unstructured data that they could analyze on their own timeline.
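To make ‘schema on write’ concrete, here is a minimal Python sketch. It is an illustration only, not how any particular warehouse product works: the `HeartRateReading` structure, the `SchemaOnWriteStore` class and the field names are all hypothetical. The point it shows is that a record must fit the warehouse's predefined structure at load time, and anything that does not fit is turned away.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class HeartRateReading:
    """Hypothetical fixed structure chosen by the warehouse designers."""
    patient_id: str
    taken_on: date
    beats_per_minute: int


class SchemaOnWriteStore:
    """Toy warehouse: every record is shaped and validated as it is written."""

    def __init__(self):
        self.rows = []       # records that matched the schema
        self.rejected = []   # records that did not fit and were never loaded

    def write(self, record: dict) -> bool:
        try:
            row = HeartRateReading(
                patient_id=str(record["patient_id"]),
                taken_on=date.fromisoformat(record["taken_on"]),
                beats_per_minute=int(record["beats_per_minute"]),
            )
        except (KeyError, ValueError, TypeError):
            # The record does not match the predefined structure.
            self.rejected.append(record)
            return False
        self.rows.append(row)
        return True


store = SchemaOnWriteStore()
accepted = store.write(
    {"patient_id": "p1", "taken_on": "2017-04-25", "beats_per_minute": "72"}
)
# Same kind of measurement from another source, but with different field names.
refused = store.write({"patient": "p2", "hr": 80})
```

In this sketch the second record carries a perfectly usable heart rate, yet it never reaches the store because its field names differ from the ones the designers expected. That is the kind of friction researchers ran into when each institution defined its own structure.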

The concept of Data Lakes was born.

A Data Lake is a storage system that can hold massive amounts of data. Unlike the Data Warehouse, with its structured, hierarchical format, the Data Lake holds raw data, intentionally eschewing up-front formatting to give users unfiltered access to the most up-to-date information. Data Lakes use the concept of ‘schema on read’: data is stored as it arrives, and structure is applied only when the end user accesses it.
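A small Python sketch can illustrate ‘schema on read’. Everything here is assumed for illustration: the `RAW_LAKE` list stands in for raw files in a lake, and the field names and the two reader functions are hypothetical. Records are kept exactly as they arrived, and each reader decides at query time which fields it cares about.

```python
import json

# Raw records kept exactly as they arrived, with no agreed structure.
RAW_LAKE = [
    '{"patient_id": "p1", "taken_on": "2017-04-25", "beats_per_minute": 72}',
    '{"patient": "p2", "hr": 80, "steps": 9500}',
    '{"patient_id": "p3", "sleep_hours": 7.5}',
]


def _patient(record: dict):
    """Accept either of the two (hypothetical) patient field names."""
    return record.get("patient_id", record.get("patient"))


def read_heart_rates(raw_lines):
    """Apply a heart-rate view at read time; other records are skipped."""
    readings = []
    for line in raw_lines:
        record = json.loads(line)
        rate = record.get("beats_per_minute", record.get("hr"))
        if _patient(record) is not None and rate is not None:
            readings.append((_patient(record), int(rate)))
    return readings


def read_sleep(raw_lines):
    """A second reader imposes a different view on the same raw data."""
    readings = []
    for line in raw_lines:
        record = json.loads(line)
        if _patient(record) is not None and "sleep_hours" in record:
            readings.append((_patient(record), float(record["sleep_hours"])))
    return readings


heart_rates = read_heart_rates(RAW_LAKE)
sleep = read_sleep(RAW_LAKE)
```

Nothing was rejected when the data was stored, so two researchers with different questions can each impose their own structure on the same raw records. The trade-off is that the work of interpreting field names moves from the loader to whoever reads the data.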

Therefore, with Data Lakes at its disposal, the Healthcare Industry is not constrained by institutional schemas. It is logical that hospitals worldwide created their own Data Warehouses based on their own understanding of what the front-end user required. Naturally, though, each institutional Data Warehouse is managed by a different team, and differences in each team’s intake process can cause wide gulfs in how information is analyzed. In contrast, the Data Lake allows users to pull raw healthcare data unburdened by well-meaning but ineffective filters.

Data Lakes provide numerous advantages over Data Warehouses for the Healthcare Industry beyond data capture.

Healthcare spending in Canada now runs into the billions of dollars annually. A portion of this cost is infrastructure spending to operate Canadian healthcare institutions, including their IT operations and data storage. Adopting Data Lakes greatly reduces the costs associated with data capture and storage. Not only do operators save on the physical assets required for storage, but they can also avoid the cost of hiring specialized staff for schema design and data input.

Data Lakes also allow practitioners to provide patients with Precision Medicine, an emerging medical concept that proposes tailoring healthcare to individual patients. Using Data Lakes together with health applications such as the previously mentioned Fitbit personal activity tracker, unfiltered health information can be captured from individuals and analyzed in a timely way, with immediate impact for patients. By their very definition, Data Lakes provide the most open, agile format for end users.

The Healthcare Industry can now take advantage of Data Lakes supported by Microsoft Azure.

Azure Data Lakes will enable the Healthcare Industry to create repositories where its data can be held without constraint. Data of any size or format can be held at a much lower cost, and these savings can be used toward providing improved patient care. Health practitioners and researchers can also access data in real time, increasing the speed with which they can apply this knowledge to produce real-world results. Azure Data Lakes also enable users to invest in new technology without concern that this investment will not sync with their current Data Warehouse.

Big Data provided the Healthcare Industry with volumes of structured information that influenced practitioners and researchers alike. Azure Data Lakes are the bold next step and the future of Healthcare Data.

BORN Changing Maternal Health in Ontario

BORN Conference 2017 in Toronto during the High Park Sakura

The BORN 2017 conference was held in Toronto during the cherry blossom season in High Park, also known as the Sakura. More than 200 attendees came to the conference from across Ontario. The keynote speaker was Dr. Neel Shah. Neel is an Assistant Professor at Harvard Medical School and Director of the Delivery Decisions Initiative at Ariadne Labs for Health Systems Innovation. He is an expert in designing, testing, and spreading system interventions that improve safety, affordability, and experience during pregnancy and childbirth. He is a co-author of a recently published book on value-based healthcare focused on using data to improve the value of healthcare.

Neel and other speakers at the BORN conference presented the importance of data-driven decision making. This approach is not new to healthcare providers and researchers in Ontario who use the BORN Information System (BIS) for researching maternal and perinatal health. Since 2012 there has been an extensive amount of information coming into BIS: roughly 3,000 entries a day cataloguing 140,000 births, which account for about 40% of all births in Canada. The data comes from different sources, all playing a critical role in the pregnancy journey. The sources of data to BIS include fertility clinics, prenatal and newborn screening labs, specialized antenatal clinics, birthing hospitals (including NICUs), birthing centres, midwifery practice groups, prenatal and neonatal screening follow-up clinics, primary care settings and autism treatment centres.

This extensive longitudinal health data set is one of the most comprehensive in Canada, if not globally. There are six KPIs that the BORN team and Ontario healthcare providers have been tracking for the last five years. At the conference, Dr. Sandra Dunn presented the BORN team's research study on maternal-newborn care practices and outcomes across Ontario, based on use of the Maternal-Newborn Dashboard (MND) by all hospitals providing maternal-newborn care. The big takeaway is that four of the six KPIs have been shown to have a positive impact on the health of Ontario women and children.

The two-day conference was both inspiring and insightful for every Dapasoft team member who attended the event. As an organization, we are humbled to play a small role, using our Corolar Platform and health analytics capabilities, in helping BORN advance healthcare for all Ontario citizens. The most exciting news from the BORN 2017 Conference was the announcement made by Mari Teitelbaum that, during the two days, the BORN Information System recorded the 1 millionth birth in Ontario!