This repository contains the source code for reproducing the methods and analyses in the following paper:
COHD-COVID: Columbia Open Health Data for COVID-19 Research
Junghwan Lee, Jae Hyun Kim, Cong Liu, George Hripcsak, Casey Ta, Chunhua Weng
Preprint
- In SQL Server Management Studio (SSMS), configure Results to Text to save tab-delimited files: under Tools > Options > Query Results > SQL Server > Results to Text, set "Output format" to "tab delimited" and enable "Include column headers in the result set". Then restart SSMS for the new settings to take effect.
- Enable SQLCMD mode at Query > SQLCMD Mode
- Execute the query you want to export from the /sql_queries directory, after updating the output path that follows the :OUT command.
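
Once a query has been exported with the steps above, the tab-delimited file can be loaded in Python. The following is a minimal sketch, not part of this repository: the function name `read_ssms_export` and the column names in the sample are hypothetical. It assumes the export may contain a dashed separator line under the header and a trailing row-count message, and skips both defensively.

```python
import csv
import io
import re

# Matches a trailing message such as "(2 rows affected)" (assumed format).
ROW_COUNT_RE = re.compile(r"\(\d+ rows? affected\)")


def read_ssms_export(stream):
    """Parse a tab-delimited export into a list of dicts keyed by column header."""
    header = None
    rows = []
    for fields in csv.reader(stream, delimiter="\t"):
        cells = [f.strip() for f in fields]
        if not any(cells):
            continue  # blank line
        if all(set(c) <= {"-"} for c in cells):
            continue  # dashed separator line under the header
        if len(cells) == 1 and ROW_COUNT_RE.fullmatch(cells[0]):
            continue  # row-count message
        if header is None:
            header = cells  # first meaningful line holds the column names
            continue
        rows.append(dict(zip(header, cells)))
    return rows


# Illustrative sample; real exports would be opened with open(path, newline="").
sample = (
    "person_id\tconcept_id\n"
    "---------\t----------\n"
    "1\t254761\n"
    "2\t437663\n"
    "\n"
    "(2 rows affected)\n"
)
rows = read_ssms_export(io.StringIO(sample))
```

All values are returned as strings; convert identifiers to integers as needed before further processing.
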
analysis.py contains functions to calculate concept counts, concept co-occurrences, and symptom prevalence from the raw data exported from the database. prevalence_example.ipynb contains examples of how to use the functions in the package.
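
To illustrate the three quantities named above, here is a minimal sketch under stated assumptions. It is not the repository's implementation: the function names are hypothetical, a concept count is assumed to be the number of unique patients with that concept, a co-occurrence count the number of unique patients with both concepts, and prevalence the count divided by the cohort size. The actual functions may apply additional rules (for example, time windows or cohort definitions).

```python
from collections import defaultdict
from itertools import combinations


def concept_counts(records):
    """Map each concept to its number of unique patients.

    records: iterable of (person_id, concept_id) pairs; duplicates are allowed.
    """
    patients = defaultdict(set)
    for person_id, concept_id in records:
        patients[concept_id].add(person_id)
    return {concept: len(people) for concept, people in patients.items()}


def co_occurrence_counts(records):
    """Map each unordered concept pair to the number of patients having both."""
    concepts_by_patient = defaultdict(set)
    for person_id, concept_id in records:
        concepts_by_patient[person_id].add(concept_id)
    pairs = defaultdict(int)
    for concepts in concepts_by_patient.values():
        # Sort so each pair has one canonical key, e.g. ("A", "B").
        for a, b in combinations(sorted(concepts), 2):
            pairs[(a, b)] += 1
    return dict(pairs)


def prevalence(count, total_patients):
    """Fraction of the cohort with a given concept (assumed definition)."""
    return count / total_patients


# Illustrative records: (person_id, concept_id), with placeholder concept labels.
records = [(1, "A"), (1, "B"), (2, "A"), (2, "A"), (3, "B"), (3, "C")]
counts = concept_counts(records)
pair_counts = co_occurrence_counts(records)
prev_a = prevalence(counts["A"], total_patients=3)
```

Sets are used so that repeated records for the same patient and concept are counted once, which matches the per-patient definitions assumed here.
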
analysis.py contains functions to perform various analyses based on the concept count, concept co-occurrence, and symptom prevalence results. analysis_example.ipynb contains examples of how to use the functions and perform the analyses in the paper.