{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "![Emergency department - CC0 image - pexels.com](img/pexels-pixabay-263402.jpg \"Emergency department\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Biology Order Prescription\n", "*Levi-Dan Azoulay*  \n", "*Ali Bellamine*" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Project overview" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Every day, about 50,000 people present to an emergency department (service d'accueil des urgences, SAU) in France. On average, 75% of patients return home and 20% are hospitalized. The average length of stay in the SAU is long: an estimated 20% of patients wait less than one hour, about 30% wait between 1 and 2 hours, about 30% wait between 2 and 4 hours, and slightly more than 10% stay in the SAU between 4 and 6 hours. Amid a shortage of healthcare staff, emergency department visits have been rising steadily for several years. Optimizing the emergency pathway is therefore a central issue: the human and financial cost of failures in the pathway and in the supply of care is high.  \n", "\n", "The typical emergency pathway is as follows:  \n", "1. **First contact, administrative**\n", "2. **First care contact, with a triage nurse (infirmière d'accueil et d'orientation, IAO) (~M30), including**:\n", "   - Recording the chief complaint\n", "   - Taking vital signs\n", "   - Recording some medical history and the patient's current prescriptions\n", "   - Possibly an ECG\n", "\n", "\n", "The patient is classified according to a severity score (blue, green, yellow, orange, red, or 1-2-3-4-5).\n", "\n", "3. 
**First medical contact, with a physician (~H1)**:\n", "   - History taking\n", "   - Clinical examination\n", "\n", "\n", "After this consultation, several outcomes are possible depending on the situation.\n", "The patient may be discharged, with or without a prescription, if the diagnosis is established by the clinical examination and requires neither further tests nor hospitalization.\n", "Alternatively, the patient may need tests (blood draw, X-ray, CT scan) or a specialist's opinion, in which case they must wait.\n", "\n", "4. **Additional tests or specialist opinion (ordering, performing, retrieving results)**\n", "5. **Final decision: conclusion once the test results are available (~H3)**\n", "\n", "Between each step, the patient waits for a variable length of time, while the physician juggles several patients who are at different stages.  \n", "\n", "We propose to help shorten the time between a patient's arrival and discharge by no longer making the decision to order a laboratory test contingent on the physician's clinical examination. We know that the time between arrival at the SAU and the first visit with the physician is the longest wait and the one patients tolerate least well.  \n", "\n", "Using a statistical learning algorithm, we propose to predict, from the data collected by the IAO alone, whether a laboratory test is needed, so that nurses (IDE) can draw the sample right after triage. The physician can then conclude at the first visit with the laboratory results in hand, results that would otherwise have been ordered and awaited before concluding and managing the patient. 
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Method and Objective" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
| Input data | Algorithm | Output data: vector {0,1}^d of laboratory tests, each performed (1) or not (0) |\n", "|---|---|---|\n", "| Age | MLP; NLP (embeddings, Word2Vec ...); others | Complete electrolyte panel - {0,1} |\n", "| Sex | | Liver panel (hepatocellular) - {0,1} |\n", "| Chief complaint | | Complete blood count (NFS) - {0,1} |\n", "| Vital signs (heart rate, SpO2, blood pressure, temperature, respiratory rate, pain score) | | Blood glucose - {0,1} |\n", "| Patient's prescriptions on arrival | | Hemostasis - {0,1} |\n", "| | | ... |\n", "
\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Document outline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The key steps of this project are:\n", "\n", "- I. Identify a usable database and freeze it  \n", "MIMIC-IV ED database, frozen and available through this repository\n", "- II. Explore and visualize the data  \n", "Search for correlations, data visualization, description and analysis of the input and output features\n", "- III. Select the variables of interest\n", "- IV. Define and train a statistical learning solution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# I. Identify a usable database and freeze it" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data come from the MIMIC-IV project.  \n", "MIMIC is an open medical data project initiated by the _Beth Israel Deaconess_ hospital in Boston.  \n", "Initially, only intensive care data were available.\n", "\n", "Its fourth edition provides a dataset covering a much broader scope:\n", "- Data on emergency department visits\n", "- Data on hospital admissions\n", "- Data on intensive care stays\n", "- Chest X-ray data with the associated reports" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These data were released as complementary projects:\n", "- [MIMIC-IV](https://physionet.org/content/mimiciv/0.4/): hospitalization and intensive care\n", "- [MIMIC-IV-ED](https://physionet.org/content/mimic-iv-ed/1.0/): emergency department\n", "- [MIMIC-IV-CXR](https://physionet.org/content/mimic-cxr/2.0.0/): chest X-rays\n", "\n", "These databases are complementary in that each was collected over its own time period, and these periods overlap to varying degrees.  
\n", "Some elements needed to use MIMIC-IV-ED are found in MIMIC-IV.  \n", "Reading the MIMIC-IV and MIMIC-IV-ED documentation is strongly recommended (links above).\n", "\n", "In addition, a number of resources are available on the [MIMIC-IV](https://mimic.mit.edu/) project website." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We propose to work on this dataset with the following task as our objective: **predict which laboratory tests will be performed when a patient arrives at the emergency department**  \n", "Performance will be evaluated with the following metrics:\n", "- **Accuracy**\n", "- **Precision**\n", "- **Area under the curve (AUC)**\n", "\n", "We attach particular importance to precision. Over-ordering laboratory tests that are not indicated could backfire by lengthening the time to care for the patients concerned." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## I.1 Downloading the data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Because the laboratory database is large (we return to this below), the data were pre-processed.  \n", "The pre-processing consists of:\n", "- Loading all the relevant data into a SQLite database\n", "- Filtering the laboratory rows to keep only those meeting the following criteria:\n", "    - Test date >= start date of the emergency department visit\n", "    - Test date <= end date of the emergency department visit\n", "\n", "*The transformation script can be found in `database_constitution/database_constitution.py`*\n", "\n", "A download token for the data should have been provided to you." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```\n", "    # Run these commands in a terminal\n", "    pip install -r requirements.txt\n", "    python download_data.py [TOKEN]\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## I.2 Retrieving the data in tabular format" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data are extracted from the SQLite database as follows:\n", "- Retrieval and aggregation of the following information for each emergency department visit, identified by a unique identifier *stay_id*:\n", "    - Arrival date and time *intime*\n", "    - Gender *gender*\n", "    - Age *age*\n", "    - Temperature at triage **temperature**\n", "    - Heart rate at triage **heartrate**\n", "    - Respiratory rate at triage **resprate**\n", "    - Oxygen saturation at triage **o2sat**\n", "    - Systolic (**sbp**) and diastolic (**dbp**) blood pressure at triage\n", "    - Pain score at triage **pain**\n", "    - Chief complaint at triage **chiefcomplaint**\n", "    - Visit within the previous 7 days **last_7** or 30 days **last_30**\n", "    - Medical history known at the time of the visit, coded with the International Classification of Diseases (ICD): ICD-9 **icd9** or ICD-10 **icd10**\n", "    - Patient's usual medications at the time of the visit, coded by Generic Sequence Number (GSN) (**gsn**)\n", "- Retrieval of the tests ordered for each emergency department visit:\n", "    - Tests were grouped into panels corresponding to the laboratory techniques and to the organs or functional aspects explored, based on domain knowledge.\n", "    - Each panel is a binary variable indicating whether the test was ordered at least once during the emergency department visit\n", "    - An order is identified by the presence of a test result in the laboratory results table **labevents**\n", "    - The 
test panels are:\n", "        - Cardiac (**Cardiaque**)\n", "        - Coagulation (**Coagulation**)\n", "        - Blood gas analysis (**Gazometrie**)\n", "        - Blood glucose (**Glycemie_Sanguine**)\n", "        - Hepatobiliary (**Hepato-Biliaire**)\n", "        - Complete electrolyte panel (**IonoC**)\n", "        - Lipase (**Lipase**)\n", "        - Complete blood count (**NFS**)\n", "        - Calcium-phosphate (**Phospho-Calcique**)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "from scripts import preprocessing\n", "\n", "lab_dictionnary = pd.read_csv(\"./config/lab_items.csv\").set_index(\"item_id\")[\"3\"].to_dict()\n", "get_drugs, get_diseases = True, True\n", "\n", "X = preprocessing.generate_features_dataset(\n", "    database=\"./data/mimic-iv.sqlite\",\n", "    get_drugs=get_drugs,\n", "    get_diseases=get_diseases\n", ")\n", "\n", "y = preprocessing.generate_labels_dataset(\n", "    database=\"./data/mimic-iv.sqlite\",\n", "    lab_dictionnary=lab_dictionnary,\n", ")\n", "\n", "# By design, last_7 and last_30 must be 0 when missing.\n", "# fillna returns a new Series, so the result has to be assigned back.\n", "X[\"last_7\"] = X[\"last_7\"].fillna(0)\n", "X[\"last_30\"] = X[\"last_30\"].fillna(0)\n", "\n", "assert (X[\"stay_id\"] != y[\"stay_id\"]).sum() == 0  # Sanity check: X and y rows are aligned" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# II. Explore and visualize the data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## II.1. Feature exploration" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "# Visualize the distribution of missing values\n", "# Visualize the distribution of features by label\n", "# Visualize correlations\n", "# Visualize the text\n", "# Analyze medical history (ATCD)\n", "# Analyze treatments" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## II.2. 
Label exploration" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# Visualize label frequencies" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# III. Selecting the variables of interest" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "# Define the feature selection strategy (see sklearn)\n", "# Apply it and produce a relevant dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# IV. Defining and training a statistical learning algorithm" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "# Plan the processing pipeline\n", "# Algorithms to build:\n", "# Simple tree classifier: rationale, explainability\n", "# MLP\n", "# Add treatments and see\n", "# Add medical history (ATCD) and see\n", "# Add text and see" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# OLD - History" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "from sklearn.model_selection import train_test_split\n", "from sklearn.impute import SimpleImputer, MissingIndicator, KNNImputer\n", "from sklearn.linear_model import LogisticRegression\n", "from sklearn.multioutput import MultiOutputClassifier\n", "from sklearn.neural_network import MLPClassifier\n", "from sklearn.compose import ColumnTransformer\n", "from sklearn.preprocessing import OrdinalEncoder, StandardScaler, PolynomialFeatures\n", "from sklearn.pipeline import Pipeline, FeatureUnion\n", "from sklearn.ensemble import RandomForestClassifier\n", "from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer\n", "from matplotlib import pyplot as plt" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "import torch" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ 
"# Arrival time of day, in seconds since midnight\n", "X[\"time\"] = pd.to_datetime(X[\"intime\"])\n", "X[\"time\"] = X[\"time\"].dt.hour*3600+X[\"time\"].dt.minute*60\n", "#X[\"temperature\"] = ((X[\"temperature\"]-32)*(5/9))-37 # Fahrenheit to Celsius, with 37 taken as the norm\n", "#X[\"heartrate\"] = X[\"heartrate\"]-80\n", "#X[\"resprate\"] = X[\"resprate\"]-18\n", "#X[\"o2sat\"] = X[\"o2sat\"]-100\n", "#X[\"sbp\"] = X[\"sbp\"]-120\n", "#X[\"dbp\"] = X[\"dbp\"]-80" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "IonoC\n", "0 AxesSubplot(0.125,0.125;0.775x0.755)\n", "1 AxesSubplot(0.125,0.125;0.775x0.755)\n", "Name: age, dtype: object" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "pd.merge(\n", " X,\n", " y,\n", " left_on=\"stay_id\",\n", " right_on=\"stay_id\"\n", ").groupby('IonoC')[\"age\"].plot(kind='kde')" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [], "source": [ "X.loc[(X[\"temperature\"] >= 107),\"temperature\"] = 107" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "X[\"temperature_norm\"] = (X[\"temperature\"]-X[\"temperature\"].mean())/(X[\"temperature\"].max()-X[\"temperature\"].min())\n", "ax = plt.subplot()\n", "pd.merge(\n", " X,\n", " y,\n", " left_on=\"stay_id\",\n", " right_on=\"stay_id\"\n", ").groupby('IonoC')[\"temperature_norm\"].plot(kind='kde', ax=ax)\n", "ax.set_xlim(-0.1, 0.1)\n", "ax.legend()" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " stay_id intime gender age temperature heartrate \\\n", "0 30000012 2126-02-14 20:22:00 F 65 98.8 96.0 \n", "1 30000017 2185-06-18 11:51:00 M 39 NaN 73.0 \n", "2 30000038 2152-12-07 16:37:00 F 80 97.1 54.0 \n", "3 30000039 2165-10-06 11:47:00 M 80 98.6 85.0 \n", "4 30000055 2155-07-18 17:03:00 F 63 99.4 85.0 \n", "... ... ... ... ... ... ... \n", "448967 39999939 2155-08-04 11:15:00 M 57 98.4 84.0 \n", "448968 39999953 2152-06-22 14:08:00 F 36 98.2 108.0 \n", "448969 39999961 2145-05-16 17:16:00 F 55 99.3 119.0 \n", "448970 39999964 2130-06-05 11:53:00 M 50 98.6 64.0 \n", "448971 39999965 2125-09-14 00:46:00 F 30 97.5 65.0 \n", "\n", " resprate o2sat sbp dbp pain \\\n", "0 18.0 93.0 160.0 54.0 0.0 \n", "1 18.0 97.0 156.0 112.0 0.0 \n", "2 18.0 95.0 143.0 73.0 0.0 \n", "3 16.0 98.0 189.0 96.0 0.0 \n", "4 16.0 100.0 NaN NaN 0.0 \n", "... ... ... ... ... ... \n", "448967 16.0 97.0 152.0 90.0 2.0 \n", "448968 18.0 100.0 155.0 94.0 5.0 \n", "448969 22.0 NaN 132.0 74.0 7.0 \n", "448970 18.0 99.0 127.0 64.0 4.0 \n", "448971 16.0 100.0 132.0 77.0 0.0 \n", "\n", " chiefcomplaint last_7 last_30 \\\n", "0 CHANGE IN MENTAL STATUS 0.0 1.0 \n", "1 ETOH, Unable to ambulate 1.0 1.0 \n", "2 Cough NaN NaN \n", "3 s/p Fall NaN NaN \n", "4 L Ear pain NaN NaN \n", "... ... ... ... \n", "448967 Chest pain NaN NaN \n", "448968 Palpitations, Dizziness, Headache NaN NaN \n", "448969 Chest pain, Cough, Dyspnea 0.0 1.0 \n", "448970 SI, Depression 0.0 1.0 \n", "448971 Labor NaN NaN \n", "\n", " icd9 \\\n", "0 [2761, 4589, 5712, 5990, 7804] \n", "1 [07070, 30300, V600] \n", "2 [9221, 92231, 92232, 95901, E8889] \n", "3 NaN \n", "4 [3804] \n", "... ... \n", "448967 NaN \n", "448968 NaN \n", "448969 [4019, 7295, 7820, 78909, E9208] \n", "448970 [71946, 7242, 78659, 9221, 92231, 92411, 95901... \n", "448971 NaN \n", "\n", " icd10 \\\n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "... ... 
\n", "448967 NaN \n", "448968 NaN \n", "448969 [C50919, H6692, I10, L308, L538, R112, R29810,... \n", "448970 [F329, H5711, M25561, M549, M79662, R45851, R5... \n", "448971 NaN \n", "\n", " gsn time \\\n", "0 [8209, 21413, 2510, 27462, 66295, 6818, 6818, ... 73320 \n", "1 NaN 42660 \n", "2 [16925, 15864, 16278, 17037, 17037, 21413, 289... 59820 \n", "3 [8209, 19293] 42420 \n", "4 [8182] 61380 \n", "... ... ... \n", "448967 [8182, 16578, 22159, 22159, 27475, 27475] 40500 \n", "448968 [6655] 50880 \n", "448969 [16927, 16995, 16995, 13109, 2173, 2169, 18368... 62160 \n", "448970 [44633, 44633, 4561, 4561, 4561, 4540, 4540] 42780 \n", "448971 NaN 2760 \n", "\n", " temperature_norm \n", "0 0.007457 \n", "1 NaN \n", "2 -0.008446 \n", "3 0.005586 \n", "4 0.013070 \n", "... ... \n", "448967 0.003715 \n", "448968 0.001844 \n", "448969 0.012134 \n", "448970 0.005586 \n", "448971 -0.004704 \n", "\n", "[448972 rows x 19 columns]" ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X" ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAY4AAAD4CAYAAAD7CAEUAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAA+mUlEQVR4nO3dd3hU17nv8e87o95RAYQkQEiid0SxKQYcm+KCnRw7dhKX2AkhttNzEuIk5zj3nji+6cU+7iV2EjtOXMA2bnFsAza9GCRhQFQ1UAE11GfW/WOPQIiRGIFGeyS9n+eZRzO7DD/xaPRqr7X2WmKMQSmllPKVw+4ASimlehctHEoppbpEC4dSSqku0cKhlFKqS7RwKKWU6pIguwP0hMTERDN8+HC7YyilVK+ybdu2cmNMUvvt/aJwDB8+nK1bt9odQymlehUROeJtuzZVKaWU6hItHEoppbpEC4dSSqku6Rd9HEopZYfm5mYKCwtpaGiwO0qnwsLCSE1NJTg42KfjtXAopZSfFBYWEh0dzfDhwxERu+N4ZYyhoqKCwsJC0tPTfTpHm6qUUspPGhoaSEhICNiiASAiJCQkdOmqSAuHUkr5USAXjVZdzaiFQ6kA1tDs4rkNh6lvctkdRanTtHAoFcDeyTvOT1fl8vW/bqOpxW13HNVLvfXWW4waNYrMzEweeOCBi34/LRxKBbDc4ipE4IO9ZXzvH5/gcuvCa6prXC4Xd999N2+++SZ5eXk8//zz5OXlXdR7auFQKoDlFVczNjmGHy4ezWufFPPfq3PQVTtVV2zevJnMzExGjBhBSEgIN910E6tWrbqo99ThuEoFKGMMecXVLBw9kK/Pz6CyvolHPzzIgIgQvnflKLvjqS762Wu55BVXd+t7jh0Sw39fM67TY4qKikhLSzv9OjU1lU2bNl3Uv6uFQ6kAVVbTSMWpJsYOiQFg5eLRVNU186d/5xMbHsxX5o6wOaHqDbxdoV7sSC8tHEoFqNwS66/TcUNiAevD/vPrJ1BV38z/vLGHReMGkxYf0aX3PFBWS+HJei4bec5M2crPzndl4C+pqakUFBScfl1YWMiQIUMu6j21j0OpANXarDE6Ofr0NqdDuGt+JgC7i6q6/J73v7GHe/66XftJ+pHp06ezf/9+Dh06RFNTEy+88ALXXnvtRb2nXwuHiCwWkb0iki8iK73sFxH5o2f/LhGZ6tmeJiLvi8geEckVkW+1Oec+ESkSkZ2ex1J/fg9K2SWvuJqh8RHEhJ09f1DWoCicDulye3mzy83GgxXUNLZQWtPYnVFVAAsKCuLBBx9k0aJFjBkzhhtvvJFx4y7u6sdvTVUi4gQeAq4ACoEtIrLaGNN2HNgSIMvzmAk87PnaAnzPGLNdRKKBbSLybptzf2eM+bW/sisVCPJKrBFV7YUFO8lMiiKvpGuF45OCSk55biQ8UFrLoJiwbsmpAt/SpUtZurT7/sb25xXHDCDfGHPQGNMEvAAsa3fMMuBZY9kIxIlIsjGmxBizHcAYUwPsAVL8mFWpgFLb2MLhilOnO8bbGzskpstXHOvzy08/P1BWe1H5VP/mz8KRAhS0eV3Iub/8z3uMiAwHpgBtx4/d42naekpEBnRbYqUCxN5j1RiD1ysOsLYfq26gotb3JqeP8suZkBJLZIiTA2Wnuiuq6of8WTi8jfdq3yPX6TEiEgW8BHzbGNP659XDQAYwGSgBfuP1HxdZLiJbRWRrWVlZF6MrZa9cz9XEuJSOrzgA9pTU+PR+tY0t7DhayZysRDIGRukVh7oo/iwchUBam9epQLGvx4hIMFbR+Ksx5uXWA4wxx40xLmOMG3gcq0nsHMaYx4wx2caY7KQkHXqoepe84moGRAQzuIN+iDGeK5G8Et9GVm0+VEGL2zA3M5GMpCgO6hWHugj+LBxbgCwRSReREOAmYHW7Y1YDt3pGV80CqowxJWLdnfIksMcY89u2J4hIcpuX1wM5/vsWlLJHXkk1Y4fEdHi
jVnxkCMmxYT73c6zfX0FokIOpwwYwIjGSosp66ppaujOy6kf8VjiMMS3APcDbWJ3bLxpjckVkhYis8By2BjgI5GNdPdzl2T4buAVY6GXY7S9FZLeI7AIWAN/x1/eglB1aXG4+PVbTYf9Gq7HJMT6PrFqfX8aM9HjCgp1kDIwC0KsOdcH8eue4MWYNVnFou+2RNs8NcLeX89bjvf8DY8wt3RxTqYBysPwUTS3uDkdUtRo7JIYP9pXR0OwiLNjZ4XGl1Q3sO17LZ6emApCRZBWOA2W1jE+J7b7gKmDdcccdvP766wwcOJCcnItvpNE7x5UKMK3NT2OTO/+lPjY5BpfbsO945x3kHx2whuHOyUwEYFhCBA5BR1b1I7fffjtvvfVWt72fFg6lAkxucRUhQQ4ykiI7Pa71iuR8/Rzr91cQFxF8uukrLNhJWnwEB3VkVb8xb9484uPju+39dJJDpQJMXkk1owdHE+Ts/O+6tAERRIUGddrPYYzho/xyZmck4nCcaf3NSIrSK46e9uZKOLa7e99z8ARYcvEr+nWVXnEoFUBa1+Dw2jFets96eDgcwpjk6E6vOA6UneJYdQNzshLP2j4iMZKDZbW4dUVBdQH0ikOpAHKsuoGTdc3eO8ZfugMQWLHu9KaxyTH8c1shbrc564qi1Uf5Z/dvtMoYGEVji5uiyvouT82uLpANVwb+olccSgWQMx3j7QpH3QmrmePYbqg/eXrz2CExnGpycfREndf3W59fztD4iHOKQ9uRVUp1lRYOpQJIXnE1IjC6feE48pHniYGjZ6Ztax155a2fo8XlZuOBCma3u9oATne8670c/cPNN9/MJZdcwt69e0lNTeXJJ5+8qPfTpiqlAkhucTXDEyKJCm330Ty8HoLCwbjgyHoYtRg4e22OpROSzzplV1EVNY0t5zRTgXXneVxEsF5x9BPPP/98t76fXnEo1QM+zi/nWFXDeY/raA0ODq+HoTMhJRuOfHx6c2drc7z+SQkicElGwjn7RMQzskoLh+o6LRxK+VlpTQNfeGITi36/lrdySjo8rqqumaMn6s7tGK87AcdzYPgcGHYpFO+ExjM3/Xlbm2PL4RM88/EhPp+dRnxkiNd/b0RipA7JVRdEC4dSfpbjWRs8MsTJir9s50cv76besxIfQEOzi2c+OsSVv/8QgJnp7W7Uau3fGD4Xhs+2mqsKNp/e3X5tjpqGZr7z952kDojgJ1eP7TBXxsAoymoaqapv7o5vU3WgN6zv3tWMWjiU8rOcIqvDe8235rLisgxe2HKUq/+0jp0FlTy74TDzf/UB972Wx7CESJ7/6iyyh7crHK39G0OmQuoMEOdZzVXt1+b42Wt5FFfW87vPTz63r6SN1pFVege5/4SFhVFRURHQxcMYQ0VFBWFhvi8lrJ3jSvnZ7qIq0hMjiYsIYeWS0czNSuQ7f9/JdQ9ZVxLThw/gtzdO4pKMBO/TqB9aZ/VvBIVYjyGT24yyOnttjtrGZv65rZBvLMxk2rDOF8dsO7JqylBdSNMfUlNTKSwsJNAXkwsLCyM1NdXn47VwKOVnuUVVZ11FzM5M5K1vz+PxdQeZk5nIpR0VDIBTFVCaC+N/cmbbsNmw6RForofg8NNrc3y4r4y84mompMTyzcuzzpsrLT6CYKdoB7kfBQcHk56ebneMbqdNVUr5UUVtI8VVDUxoN315fGQIP1w8mtmZiR0XDWjTvzHvzLZhs8HVBIVbT28amxzDR/kV1De7+N3nJxN8nnmuAIKdDoYlRGrhUF2mhUMpP8o5z9rh53V4PQRHwJApZ7YNnQWI136Oe5eOIdOzUNM5qorgla9Dw5nlZnVklboQ2lSllB+1jqgaN+QCF0w6vB7SPP0brcLjYPD4s/o5vjRrGMmx4dw8I63j99r0CHzyN6vwTLsNsEZWvb+3lGaX26erFKVArziU8qucoiqGJUQQGx7c9ZNb+zeGzzl337A51pDcliYABsWE8YWZQztu9nI1wye
eu4f3vHZ6c0ZSFM0uQ0EHc10p5Y0WDqX8aHdR1YUvz9r2/o32hl0KLfVQstO399r/Dpwqg0Hj4dCHp5urdM4qdSG0cCjlJ5V1TRSerGf8BTdTrTu3f6PVsEs9x6z37b12/AWiBsGSX1od6/vfBWCEzpKrLoAWDqX8JNfTMd5+RJXPvPVvtIpMhKTRZ3WQd6jmOOx7GybdDEMvsQrIntUAxIYHkxQdqoVDdYkWDqX8ZPfpjvELGFF1qhxK8yDdSzNVq2GXwtGN4HZ1fAxYfRvGBVO+BA4HjL7auuJorgeskVX5pVo4lO+0cCjlJzlFVaTEhTOgg0kGO9VZ/0arYbOhqQaO7er4GGOsZqq0WZDouSlwzDXQXAcH/g1A5kBr/fFAnhZDBRYtHEr5SU5R1cU1U3XUv9GqtZ+js+aqgs1QsR+m3nJm2/A5EBZ3enRV5sAoquqbKa9turCsqt/RwqGUH1Q3NHO4oo7xF3Pj39BZ4OxkGG/MEBiQDoc/6viYHc9BcCSMve7MNmcwjFoCe9eAq/n0DYP7S2u8v4dS7WjhUMoPcousjvELGorb2r/h7f6N9obPhqMfg9t97r7GWsh9BcZfD6Ht7iYfc401JPfwerIGRgNwQPs5lI+0cCjlB7nFVsf4BRWO3FesrxkLz3/ssNlQfxI2/u+5xSPvVWiqhSm3nntexkKrKWzPawyKCSUqNEg7yJXPtHAo5Qe7i6pIjg0jMSq0aycaA5sft/o2kief//gx18KIBfDOj+GpRXA878y+HX+BhCxIm3HuecHhkHUFfPo6YgwZA6PYr4VD+UgLh1J+kFNUdWHzUx1aC+V7YfpXobNZc1uFRsEtr8D1j8KJA/DoXHjv/8Cx3XB0gzUEt6P3GX0N1B6Hwi1kJkXpFYfymV8Lh4gsFpG9IpIvIiu97BcR+aNn/y4RmerZniYi74vIHhHJFZFvtTknXkTeFZH9nq+6Ao0KKLWNLRwsP3VhI6q2PA7h8TD+s76fIwKTboK7t8CEG2Hdb+DxhdZKgZNu7vi8kVeCIxj2rCZrUBSlNY1UN+gysur8/FY4RMQJPAQsAcYCN4tI+wWQlwBZnsdy4GHP9hbge8aYMcAs4O42564E3jPGZAHveV4rFTD2lFRjDF0fUVVZAJ++AVNvtZqSuioyAa5/GG5dBQOGw8QbIXpQx8eHxcKI+bDnNTITrTmr9KpD+cKfVxwzgHxjzEFjTBPwArCs3THLgGeNZSMQJyLJxpgSY8x2AGNMDbAHSGlzzp89z/8MXOfH70GpLttdaHWMd/mKY9vT1tfsOy4uwIj5cM8WuO7h8x7KmGug8ghjHUcAyD+uhUOdnz8LRwpQ0OZ1IWd++ft8jIgMB6YAmzybBhljSgA8Xwd6+8dFZLmIbBWRrYG+3q/qW3KKq0iKDmVgTJjvJzU3wLZnYOQSGDCse4L40kcyaimIg8HF7xIS5CBf56xSPvBn4fD2U9t+ToNOjxGRKOAl4NvGmOqu/OPGmMeMMdnGmOykpKSunKrURbmgO8bzXoW6CpjxVb9k6lBUEqRk4zj0gc5ZpXzmz8JRCLRdjiwVKPb1GBEJxioafzXGvNzmmOMikuw5Jhko7ebcSl2wqrpm8ktrmZjaxcKx+XFr6OyI+X7J1am0GVCyi9GJIVo4lE/8WTi2AFkiki4iIcBNwOp2x6wGbvWMrpoFVBljSsRaxuxJYI8x5rdezrnN8/w2YJX/vgWluubjA+W4DczJTDx7R2MtvHWv93mlirZB0VaYsdy35qXuljodXI3MjCyh4GQdDc3nmW1X9Xt+KxzGmBbgHuBtrM7tF40xuSKyQkRWeA5bAxwE8oHHgbs822cDtwALRWSn57HUs+8B4AoR2Q9c4XmtVEBYu7+cqNAgJqXFnb1j29Ow8SF4egn843aoPHpm3+YnICTKGlJrh9TpAEww+zBGF3VS5xfkzzc3xqzBKg5ttz3S5rkB7vZy3nq8939gjKkALu/epEpdPGMM6/a
XcUlGAsHONn+TuVpg4yPWIkoj5sP638PeN+HSb1hDb3NesmavDbvACREvVmwKRA9haF0uMJ780toLu3lR9Rt657hS3eRIRR2FJ+uZm9WumWrPKqguhNnfgvkr4RtbrcWU1v4K/jQNXI3WneJ2Ss0munwnDtF7OdT5aeFQqpusyy8HYG5Wm1F8xsDHD0J8BmQtsrbFpsJ/PAl3vGPNSTXuszBwtA2J20idjlQeYfKAZi0c6rz82lSlVH+ybl8ZKXHhDE+IOLPx6EYo3g5X/cZatrWtoTPhznd6NmRHPP0cC6KOsrpUZ/FRndMrDqW6QYvLzYYDFcwbmYi0HRm14UEIH9D5nFGBIHkSOILIDsrncMUpml1e1vdQykMLh1Ld4JPCSmoaW5iT2aaZ6sRBa+6p7DsgJNK+cL4IiYBB48lo3EOzy3Ckos7uRCqAaeFQqhus21+OCMzOTDizceMj4Aiyv+PbV6nTSajKwYFb+zlUp7RwKNUN1u0vZ2JKLHERIdaG+pPWQkoTboCYZHvD+Sp1Os6WOkZKod7LoTqlhUOpi1Td0MzOgsqzR1Nt+zM0n4JL7ur4xECTmg3AgsjDesWhOqWFQ6mLtOFABS63YU7r/RuuZtj0KKRfBoMn2BuuK+JHQHg8s0IPsb+0xu40KoBp4VDqIq3fX05EiJOpQz3DWHNfhZpiuOQeW3N1mQikTmesay8HSk/hdrefzFopixYOpS7Suv1lzBqRQEiQ5+O07WlIyITMz9gb7EKkTiep4TDBzdUUV9XbnUYFKC0cSl2EghN1HK6oOzPNSM1xawbcCTece8Nfb+Dp55jkOMB+7edQHeiFP9lKBY51+1unGfEUjk9fAwyMuda+UBcjZSoGYYrkc0ALh+qATjmi1EVYn19GcmwYGUlR1oa8VdaCTAPH2BvsQoXFIkmjmVF2gN/nHKO6vpmKU02c8Dw+Ny2VG7PTzv8+qk/TKw6lLoAxhifWHeTNnGNcPmagNc3IqXI4/BGMXWbPgkzdJTWbKY58th45wZ/ez+fNnGPsL63lUPkpfv32Xlp0OpJ+T684lOqiphY3P3l1Ny9uLWTJ+MHcu9RzdfHpG2BcVuHozVKnE7njOXbcNYKY1DE4HVYRfCf3GMuf28aH+8q4fMwgm0MqO+kVh1JdcOJUE196chMvbi3kGwszeegLU4kI8fz9lbcKBqT3rns3vPHMlDvgxCeniwbAgtEDSYwK5e9bCuxKpgKEFg6lfLT/eA3LHlrPzoJK/nDTZL535Sgcrb9Y607AoQ97fzMVQNIoCImGwi1nbQ52Ovjc1BT+/WkpZTWNNoVTgUALh1I++v4/PqG+ycXfl89i2eSUs3fufRPcLb2/mQrA4YSUqVC09ZxdN2Sn0eI2vLKj0IZgKlBo4VDKBw3NLnKLq/n89DSmDPWy0FHeKogdaq3o1xekTodjOZD7ijU9vNvqEM8cGMW0YQP4+5YCjNE7y/sr7RxXygf7jtfQ4jaMGxJ77s6GKjjwb5j5td7fTNUq60r4+E/wj9ut16ExMGg8DJ7A8pGL+Nq7DWw/Wsm0YbpaYH+khUMpH+QWVwMw3lvh2Pc2uJv7RjNVq6EzYeVRKNsDJbvg2G44tgt2PMcVoasYHPJzXtxSoIWjn9LCoZQPcoqqiA4LIi0+/NydeasgegikZPd8MH8KDrOa3to2vxXvwPHEZ3howN+5dded/Nc1Y4kM1V8j/Y32cSjlg9ziasYmx5y9njhAYw3sfxfGXts756bqqiFTYN5/Mq3qHea0bOCN3SV2J1I26Ac/6UpdnBaXmz0l1YxP8dJMtf8dcDX2rWaq85n7PUzyJB4IfZq3Nu22O42ygRYOpc7jYPkpGlvcjBsSc+7OvFUQNQjSZvZ8MLs4g5HrHyWGOm449lsO6KJP/Y4WDqXOI7e4CuDcK46mU1Yz1eirrXsf+pOBY6ifs5Ilzi1sePURHZrbz2jhUOo8coqqCQ1yMCIx8uwdG/4Xmutg0k32BLN
Z1ILvUBw9gWuKfsvz722yO47qQX4tHCKyWET2iki+iKz0sl9E5I+e/btEZGqbfU+JSKmI5LQ75z4RKRKRnZ7HUn9+D0rlFlcxOjmGIGebj0t1Caz/nXW1kTbDvnB2cjgZfOszhDtcDF37Pbasfwea6uxOpXqAT4VDRF4SkatExOdCIyJO4CFgCTAWuFlExrY7bAmQ5XksBx5us+8ZYHEHb/87Y8xkz2ONr5mU6ipjDLnF1Yxv37/x7/8Lria48v/aEyxAOJIyYdH9zHHkMP1fN2B+kQJ/yrZuHFz7a6gtszui8gNfC8HDwBeA/SLygIiM9uGcGUC+MeagMaYJeAFoP/RkGfCssWwE4kQkGcAYsxY44WM+pfyi4EQ9NQ0tZ98xXrwTdv4NZq2A+BG2ZQsUIbO+QtkdW/hB0A94ynEDjQMyoWi7VVxX3W13POUHPhUOY8y/jDFfBKYCh4F3ReRjEfmyiAR3cFoK0Hb+5ULPtq4e4809nqatp0REb11VfpNzumPcc8VhDLx9L0QkwLz/tDFZYEkaOpJbbr+HXzVdz83V36Dh7h1wyT3WVCz1lXbHU92sK01PCcDtwFeAHcAfsArJux2d4mVb+6EXvhzT3sNABjAZKAF+00He5SKyVUS2lpXp5bK6MLnFVTgdwshB0daGPa/BkY9gwb0Q5uW+jn5sQmosv//8ZLYfreThDw7AuOutqVj2amtyX+NrH8fLwDogArjGGHOtMebvxphvAFEdnFYItF2cOBUovoBjzmKMOW6McRlj3MDjWE1i3o57zBiTbYzJTkpK6uwtlepQbnE1WQOjCAt2QksjvPtTSBoDU2+zO1pAWjw+mblZiby0vRAzZCrEpkHuq3bHUt3M1yuOJ4wxY40xvzDGlACISCiAMaajCXq2AFkiki4iIcBNwOp2x6wGbvWMrpoFVLW+f0da+0A8rgdyOjpWqYuVU1R9pn9j0yNw8jAsvh+cOj9TR66fkkLhyXq2Hq207qjX5qo+x9fC8T9etm3o7ARjTAtwD/A2sAd40RiTKyIrRGSF57A1wEEgH+vq4a7W80Xkec+/MUpECkXkTs+uX4rIbhHZBSwAvuPj96BUl5RWN1Be22jdMV5bZo0SyloEGQvtjhbQFo0bTHiwk5e3F8HY6zzNVW/aHUt1o07/bBKRwVid1eEiMoUzfRIxWM1WnfIMlV3TbtsjbZ4bwOuwC2PMzR1sv+V8/65S3SGn7R3j639t3ex3pbe/oVRbkaFBLBo3iDd2FfPfV19OWEwq5L0Kk71+pFUvdL7r7UVYHeKpwG/bbK8B7vVTJqUCQm6RtQbHmORoWL0GMq+ApJE2p+odrp+ayqs7i/lgXxmLxy6DLY9bC17pgII+odOmKmPMn40xC4DbjTEL2jyuNca83EMZlbJFTnEV6YmRRDeUwMlDMOIyuyP1GrMzEkiKDrWaq8ZdZ90sqc1VfUanhUNEvuR5OlxEvtv+0QP5lLJNbnE1Y4fEwMEPrQ3pWjh8FeR0sGzSEN7fW8rJARMhJkVHV/Uh5+scb53VLQqI9vJQqk+qqmum8GS91TF+6EOIHAgDx9gdq1e5fmoKzS7D6znHPaOr3rOaq1Sv12kfhzHmUc/Xn/VMHKUCw+mp1JNjYOtaSJ8H7Vf/U50amxzDyEFRvLqjiFuuug42/i/sfQsmfd7uaOoi+XoD4C9FJEZEgkXkPREpb9OMpVSfk1tsdYxPDC2B2uMwYr69gXohEeH6KalsO3KSIxFjrXXZ8161O5bqBr7ex3GlMaYauBrrbu+RgE7Uo/qsnOIqkmPDiDvmuV1JO8YvyHVThiACr+wssZqr8t+Dhmq7Y6mL5GvhaJ3IcCnwvDFGZ61Vfdr2oyeZmBpr9W8MSIe4oXZH6pWSY8O5ZEQCr+wowoxdZq3Pvu8tu2Opi+Rr4XhNRD4FsoH3RCQJaPBfLKXsU1JVT8GJemYOi4XD6/Vq4yJdPyWFIxV1bDcjreYqHV3V6/k6rfpK4BI
g2xjTDJzi3LU1lOoTNh+yLqjnRRVCY7UOw71Ii8cPxiHw4f4KT3PVv6Cxxu5Y6iJ0ZenYMcDnReRW4D+AK/0TSSl7bTl8gsgQJ+nVW6wN6fPsDdTLRYcFM3JQNLsKK2H0Uqu56vB6u2Opi+DrqKrngF8Dc4DpnkdHs+Iq1attPnSCacPjcR5eC4MmQGSi3ZF6vYmpsXxSUIlJnQHBEVYnueq1fJ0bOhsY65mUUKk+6+SpJvYdr+X68fGwcTPM+KrdkfqESWlxvLi1kIJqN0OHz7GmWle9lq9NVTnAYH8GUSoQbD1yEoD54QetJhXt3+gWk1LjAPiksNKalv7EAWttE9Ur+Vo4EoE8EXlbRFa3PvwZTCk7bD5UQYjTQVbddnAEwbBL7Y7UJ4waHE1okINPCioh43Jr44H3bc2kLpyvTVX3+TOEUoFi8+GTTEqLJejwWkjJhtCOVkZWXRHsdDBuSIx1xXHVJRCTas1dlf1lu6OpC+DrcNwPgcNAsOf5FmC7H3Mp1eNONbaQU1TF3LRgKNmp9290s0lpceQUVdPiNpCxAA6uBVeL3bHUBfB1VNVXgX8Cj3o2pQCv+imTUrbYcbQSl9uwMGwfGLfOT9XNJqXGUd/sYn9pLWReDo1VUKx/f/ZGvvZx3A3MBqoBjDH7gYH+CqWUHTYfPoFDYOSp7daQ0RQdcd6dJqXFAVj9HOmXAaKjq3opXwtHozGmqfWFiAQBOjRX9SmbD1UwdkgMIUfXWZ3iQSF2R+pThidEEBMWZPVzRMRDylS9n6OX8rVwfCgi9wLhInIF8A/gNf/FUqpnNbW42XG0kqWDTkL5Xh2G6wciwqS0OD4p8CzmlHE5FG2F+kpbc6mu87VwrATKgN3A14A1wE/8FUqpnra7qIqWlma+UPIrCI+HSTfbHalPmpQax97jNdQ3uaz7OYwbDq21O5bqIp+G4xpj3CLyKvCqMabMv5GU6nmbD53gK841xJ3cBf/xFEQl2R2pT5qUFofLbcgtriI7LRtCoq1+jrHX2h1NdUGnVxxiuU9EyoFPgb0iUiYi/9Uz8ZTqGYX7dvC94H/CmGtg3GftjtNnTUqNBeCTwipwBlsTSB54D3Q2o17lfE1V38YaTTXdGJNgjIkHZgKzReQ7/g6nVE9wtbRwY/EDtDjD4arf6trifjQwJozk2DBrZBVA5kKoPAonDtqaS3XN+QrHrcDNxphDrRuMMQeBL3n2KdXrlf3rd0xiPzmTfgxROsrc3yalxlkjq8Dq5wAdltvLnK9wBBtjyttv9PRzBHs5XqnepXw/iZt/xTuuaSTPvsXuNP3CpLQ4jlTUUVnXBPEjYMBwHZbby5yvcDRd4D6lAp/bhVl1N/UmmAcj7iI1PsLuRP3CWf0cYA3LPbwOWvRXSm9xvsIxSUSqvTxqgAk9EVApv9n6FFKwif9qvIXPL5yOaN9GjxifGosIZ/o5MhZCUy0UbrE1l/Jdp4XDGOM0xsR4eUQbY87bVCUii0Vkr4jki8hKL/tFRP7o2b9LRKa22feUiJSKSE67c+JF5F0R2e/5OqAr37BSrVzbnyNXMtk/eCk3TR9qd5x+IyYsmIykKGspWYD0uSBOyHvVzliqC7qy5niXiIgTeAhYAowFbhaRse0OWwJkeR7LgYfb7HsGWOzlrVcC7xljsoD3PK+V6prqYpzHPuH1pmx+tmw8TodebfSkiamx7CyowhgDYbEw5Yuw5Qk4ssHuaMoHfiscwAwg3xhz0DPP1QvAsnbHLAOeNZaNQJyIJAMYY9YCJ7y87zLgz57nfwau80d41beVb/esQzZqMdOGxdsbph+anBZHeW0jxVUN1oZF90PcUHhlOTRU2RtOnZc/C0cKUNDmdaFnW1ePaW+QMaYEwPPV6/hJEVkuIltFZGtZmd7srs5WtOllCsxAvnydt4ta5W+tS8lu8yzVS2g0fPZxqCqEN39oXzDlE38WDm/X/u1vD/XlmAtijHnMGJNtjMlOStLpI9QZH+Q
cZlTddqrSLmdgTLjdcfql8SmxJESG8K+842c2ps2Aef8JnzwPOS97P7GxFj76IxzP7Zmgyit/Fo5CIK3N61Sg+AKOae94a3OW52vpReZU/Uhji4t3Xvs7YdLMqHk32h2n33I6hM+MGcT7n5bS1OI+s2Pef1rroLz+HagqOvukfe/A/14C7/4U/v3zng2szuLPwrEFyBKRdBEJAW4CVrc7ZjVwq2d01SygqrUZqhOrgds8z28DVnVnaNW3/WXjUSac2kBLcBTBI+bYHadfWzR+EDWNLXx8oM09xs5g+Oxj4GqGV78ObjfUHId/3A5/uwGCwyHzM9ad5k11tmXv7/xWOIwxLcA9wNvAHuBFY0yuiKwQkRWew9YAB4F84HHgrtbzReR5YAMwSkQKReROz64HgCtEZD9whee1Uj5Zv+84VwbvJCjrM7pQk80uzUgkMsTJ27nHz96RkAGLfwGHPoSX7oCHpsOnb8CCH8OKdXDpN6ClHg6+b09w5du06hfKGLMGqzi03fZIm+cGa1lab+d6XRDBGFMBXN6NMVU/YYyhpXAHCeYkjFpid5x+LyzYyfxRA3k37zg/v248jrZDoqfeCvvehtxXYPhcuPr3kJhp7Rs22xrC++kbMPoqW7L3d34tHEoFkqLKerKbNuEOcuDIutLuOAq4ctwg3thdwo6Ck2cPixaBzz0OhVutqdfb3tXvDIasRbD3TXC1gFN/jfU0f/ZxKBVQdhVW8RnHduoGTbPWvFa2WzB6IMFOObe5CiAkEkZc5n2a+9FXQf0JKNjk/5DqHFo4VL9x6MA+xjmOEDZOmzcCRUxYMJdmJPJ27jHrLnJfZV4OzlCruUr1OC0cqt8IP/QOAEFjltqcRLV15bhBHKmoY+/xGt9PCo22rkY+fV1XD7SBFg7VL7jdhqzKj6gISYHEkXbHUW1cMXYQIvCOt+aqzoy+CiqPQGmef4KpDmnhUP3CkWNlzCCHipSFujRsgBkYHcbUoQN4O/dY104cuQQQba6ygRYO1S+U7nyTUGkmVPs3AtKicYPILa6m4EQXbuqLHgSp063mKtWjtHCofiHkwDtUmwhSJi60O4ry4sqxgwF4J+8CmqtKPrEmR1Q9RguH6vvcboaf/IjdYdkEhYTanUZ5MTwxklGDonmnq81VrTcAfrqm8+NUt9LCofq8lpLdDHCfpDT5MrujqE4sGjeILYdPdK25KjHLGuywV/s5epIWDtXnledacxpFZGnhCGTLpqQQEuRg6R/W8ddNR3C7fRxmO/oqOLwe6k/6N6A6TQuH6vOaD66n0CSSNXKM3VFUJzKSonjzW/MYnxLLj1/J4abHNpJfWnvWMaU1Dbyde4yXtxeeuWFw1FXgboH979qQun/SSV5U32YMA8q38B4TuSYh0u406jzSEyP521dn8o9thfz8jT0s/cM6vjBzKBWnmthx9CSFJ+tPHztqcDTjhsRCyjSIGmQNy52oa6z0BL3iUH1b+T6iWio5PmDa2bOvqoAlItyYnca/vnsZV44bxDMfH2br4RNMTI3lJ1eN4anbswFYu8+zjofDAaOWQv6/oKHaxuT9h15xqD6t+eB6ggH3sEvtjqK6KCk6lAe/MJVf/kcLESFn/6oaPTiatfvK+Pr8DGvD1Fth29Ow+TGY930b0vYvesWh+rTavR9w3MQxNGO83VHUBWpfNADmjUxi65ETnGpssTakTIWRi+HjP+lVRw/QwqH6LmMIKdrIZvdoJqbF2Z1GdaN5WUk0uwwbD1ac2Th/JTRUwuZHbcvVX2jhUH3XyUNENpayO2gCKXHhdqdR3Sh7+ADCgh2s3Vd2ZuOQKdb8VR8/qFcdfqaFQ/VdRz4GoHbwDEQnNuxTwoKdzBqRwNr95WfvmP9D66pjk151+JMWDtVntRxaT4WJJjF9kt1RlB/My0riUPmps+80b73q2PAgNFTZF66P08Kh+qyWg+vZ4h7NxNQ4u6MoP5g3MgmAD9s2V8GZvo5Nj/V8qH5CC4f
qm6oKCast8HSMx9qdRvlBRlIkKXHhZ/dzAAyZbN3XseFPetXhJ1o4VN90ZAMAxXFTGRgdZnMY5Q8iwryRiXx8oIJml/vsnZf90Coa2tfhF1o4VJ/UdHAt1SaC4WNn2B1F+dG8rCRqG1vYcbTy7B2nrzq0r8MftHCoPqnpwHq2uEexYEyy3VGUH12amYjTIazbX3buzvkrraLx/M1Qsqvnw/VhWjhU31NbSlTNQXY5xzFt2AC70yg/ig0PZnJa3Ln9HADJk+CaP0JpHjw6D169C6qLez5kH6SFQ/U5rsPW/Rtm2CUEOfVHvK+bl5XErqIqTpxqOnfntNvgmzvg0ntg9z/gT9Pg/fuhsfbcY5XP9FOl+pyKvPepM6FkTppjdxTVA+aNTMQYWJ9f7v2A8AFw5f/A3Zth5CL48P/Bg9Ph6MaeDdqH+LVwiMhiEdkrIvkistLLfhGRP3r27xKRqec7V0TuE5EiEdnpeSz15/egeqHD69luspg3eojdSVQPmJgaR1xEsPfmqrbi0+GGZ+COdyAoFJ65CjY8BMbHlQbVaX4rHCLiBB4ClgBjgZtFZGy7w5YAWZ7HcuBhH8/9nTFmsuehq9SrM+pOkFh3gMKYqcRFhNidRvUAp0OYnZnI2n1l5w7L9WboTPjah9Zsum/fCy/eqnNbdZE/rzhmAPnGmIPGmCbgBWBZu2OWAc8ay0YgTkSSfTxXqXNU7FmLA0NE1ly7o6ge9NkpKZTWNPLEukO+nRAWC5//i9WE9ekb8Nh8OJ7r14x9iT8LRwpQ0OZ1oWebL8ec79x7PE1bT4mI12EzIrJcRLaKyNaysvNcwqo+o+yTt2g0wYybvtDuKKoHXT5mEIvHDeb3/9rHofJTvp0kApd+A25/HZpOweOXw4H3/Ru0j/Bn4fA2HWn7xsSOjuns3IeBDGAyUAL8xts/box5zBiTbYzJTkpK8imw6uUaqhlWsIr1QbMYkZxgdxrVw/7PsnGEBjlY+dIu3O4u9FsMuxRWrIPYVHjje9DiZXSWOos/C0chkNbmdSrQfhB1R8d0eK4x5rgxxmWMcQOPYzVrKUXT1mcJN3UcyLpdp1HvhwbGhPHjq8aw6dAJnt9ytGsnRw2ERffDiQPWErSqU/4sHFuALBFJF5EQ4CZgdbtjVgO3ekZXzQKqjDElnZ3r6QNpdT2Q48fvQfUWrhZcGx5mk3s047Ln251G2eTG7DQuzUjggTWfcqyqoWsnZ10B6ZfBBw9AfaVf8vUVfiscxpgW4B7gbWAP8KIxJldEVojICs9ha4CDQD7W1cNdnZ3rOeeXIrJbRHYBC4Dv+Ot7UL3Ip68TfqqQv8nVTB8eb3caZRMR4RefnUCz281PXs3BdGWorYjVWV5/EtZ5bQFXHueuAt+NPENl17Tb9kib5wa429dzPdtv6eaYqg8wGx6iiEG0ZC0iJEjva+3PhiVE8t0rRnL/mk95Y3cJV0/swv08yRNh8hdg0yMw/SswYJj/gvZi+glTvV/BFqRwM483L2bBGL3pT8Eds9OZkBLLD/+5i8fXHqSpxYf7O1ot/AmIE977mf8C9nJaOFTvt/Eh6h1RvMp8FozSEXQKgpwOHrllGjPS4/n5mj0s/sNa3t9b6tvJMUOsYbo5L0HhVv8G7aW0cKje7eQRTN4qnm1ewLIZI0mICrU7kQoQKXHhPP3lGTx1ezbGwJef3sKdz2zhSIUP93nM/iZEDoS3f6xTknihhUP1bpsfw22Ev5lF3L0g0+40KgAtHD2It749l5VLRrPxYAXXPvgRe4/VdH5SaDQsuBcKNsKe13omaC+ihUP1Xg3VuLc+wxuumVx5yTQGxegSscq70CAnKy7L4M1vzSM0yMEtT26i4ERd5ydNuQWSxljzWVUc6JmgvYQWDtV77fgLjuZanpOrWXFZht1pVC8wNCGC5+6cSWOLmy89uYmymsaOD3YGwbKHPNORLISDH/RYzkCnhUP1Tk11NH/0EJvco5kx+3Lt21A+GzU
4mqdun05pdSO3PrWZqvrmjg9OnQZf/bfVYf7cZ2HTo9rngRYO1Ru5WuCfd+CsLeZxuYGvzh1hdyLVy0wbNoBHbplGfmkNX/3zVuqbXB0fHJ8Od75jLQL15g/gtW/2+/mstHCo3sUYWPN92Pcm/9V8GxPmLtN1N9QFuWxkEr+9cTJbjpzgG8/vwNXZxIih0fD5v8Lc78P2Z+HZa6HWx+G9fZAWDtW7rPs1bHuaN2Ju4vXQpdwxZ7jdiVQvds2kIfzs2nH8a89x7l+zp/ODHQ64/KfwuSeheAc8ehkUbOmZoAFGC4fqPXb+Df79P5QOv467S6/ha/MyiA4LtjuV6uVuvWQ4X549nCfXH+IvG4+c/4QJ/2E1XTmD4eklsOWJftfvoYVD9Q7578Hqb9CQNperjtxI5sBobrtU5xFS3eMnV41l4eiB/Pfq3POvXQ6QPAmWfwAj5ltreLx6FzTX+ztmwNDCoQLfsd3w4q24E0fxxZp7aDBBPHbLNCJC/DpHp+pHnA7hjzdPIWtgFHf/dTv7jp/nBkGAiHj4wotw2Ur45G/w5BVw8rDfswYCLRwqsLU0wktfwYRG89PI/2b7cRd/unkKI5Ki7E6m+pio0CCeun06YSFO7nhmC+W1ndzj0crhgAU/sgpI5VF4dB7sed3/YW2mhUMFtrW/hrJPWZN+L3/d08IPFo1m/qiBdqdSfdSQuHCevC2b8tpGbn1ys2/FA6yhuss/hAHp8PcvwpsrrT96+igtHCpwHdsN63/LseHXcc+WBK6ZNIQVl+k9G8q/JqbG8egt2Rwsr+XGRzZQXOlj30Xr/R4zvw6bHoYnr4QTB/0b1iZaOFRgcrXAqntwhcVxw+FrGDM4hl9+bqKuJa56xGUjk3juzpmU1TRywyMbOFhW69uJQaGw5AHrno+Th6whu7mv+DesDbRwqMC04U9QspNnYu+mzBXJo7dMIzzEaXcq1Y9MHx7P88tn0dDs4sZHN5BXXO37yWOuhhXrIWkU/ON2eP270NzFNdADmBYOFXjK98P7v+Dk0EX830MjWT4vg7T4CLtTqX5ofEosL664hGCng5se28BL2wqprPNxupG4ofDlN+HSb8LWJ+GJz0B5vn8D9xDp0mLuvVR2drbZulVX8uoV3G54ZimmNI87Ih4irzac978/X4feKlsVnqzj9qe3kF9ai0Osua7mjxrIwtEDGT04+vxNqPvehldWgKsJrv49TLyhR3JfLBHZZozJbr9drzhUYNn6JBzdwI4xP+D9Ygf/uWi0Fg1lu9QBEbz97Xm8fNel3L0gk7omF796ey9L/rCOGx/dQG5xVedvMHKR1XQ1eAK8/BVYdQ80nWc9kACmVxwqcOx7G168DdfQS5hbeDfx0aGsvnsODod2iKvAc7y6gTd3l/DHf+dTWdfEzTOG8r0rRxEf2cmkm64W+OAXsO43ED0YMhZC+jwYPhdiU3ouvI86uuLQwqECw+bHrSmrB0/giaG/5H8+rODvy2cxc0SC3cmU6lRVXTO/f28fz244QlRoEN+9YiRfnDmUIGcnDToHP7TmuDq8DupPWtviMyB9LiRkWUUkJtVaByR6MDic1nxYLQ3WlUpTLbiaIX6EdROin2jh0MIRmNxuePensOFBGLmEsisf4rI/bmFeVhKP3DLN7nRK+Wzf8RruW53LxwcqGJ8SwwOfncj4lNjOT3K7oTQXDq21Hkc+hsZ2o7fECcER0HwKjPvsfQPSIfvLMPlLENn9f2Rp4dDCEXia6uCV5bDnNZjxNVj8C37wcg6v7CjiX9+9jGEJkXYnVKpLjDG8sbuE+1bncbKuiTvnpPOdz4z0fSi5MdBQCVVFUO15VBVZy9eGRJ79cDXBrn/A0Y/BGQJjr4PsO2DoLOim+520cGjhCBzGQNleWHU3FG2DRffTkP01/vDefh758ABfmZPOj68aa3dKpS5YVV0zv3hzDy9sKSAtPpz7r5/A3Kwk//xjpXtg61PwyQvW1Ur8CMj8jNV
/MnwuhF74vG5aOLRw2KulEY58ZHWA73vLmkU0KBw+9zgbQy/lRy/v5lD5KW7MTuW+a8fpSCrVJ2w8WMG9L+/mYPkprp6YzL1LxzAkLtw//1jTKch5ybqCP7wemuvAEQxpM63p35NGQmwaxA2zZvb14apEC4cWjp5Vd8JaJa14h3VVcWit1aEXFGb9EI9cRM2wz3D/uiqe33yUofER/OKzE5idmWh3cqW6VUOzi4c/OMAjHx5ABO6an8nyeSMIC/bjTAgtjXB0Ixz4t/U4tuvs/cERVhGJSYaIRIhMhIgE6xE10OqoT8hAgsO0cKhuYIz1l01dOZyq8Hwtg1Pl1vPKAqtYVLZZSS0+A5M+j/IhC9jumMDu0mZyi6vYUVBJdX0zd85J57tXjNIpRVSfVniyjvvX7GHN7mOkDgjnh4tHMzE1lsjQICJDgggLdvhvLrb6Smva98qjUFVgfU4rj0Dtcc9nt8JLp7wDua+y5wuHiCwG/gA4gSeMMQ+02y+e/UuBOuB2Y8z2zs4VkXjg78Bw4DBwozHmZGc5tHD4yO2CmhLPD9VRqPL8oFUXe4qDp1C0dDDnjjPU+gsmeTL1SRPZ58zi47oUNha72FlQSVV9s3WYQ8hMimJcSgy3XTKcSWlxPfc9KmWzjw+U87PVeextt1iUQyAiJIiQIAfBTiHY6SDY6cDpENxuQ7PbjctlaHEbRGBARAgJUSEkRIYSHxnCoJgwsgZGMWpwNClx4V2//6ml0WopqD1mTY1Svhe5/Kc9WzhExAnsA64ACoEtwM3GmLw2xywFvoFVOGYCfzDGzOzsXBH5JXDCGPOAiKwEBhhjfthZln5TONwu66+GhqouPUxDJTRUIY3nrnrWEp5Ic2QyjWGJNIYMoD4ojlPBA6h1xFLtjKPKEUOZK5rCxkhK6p1UnGqivLaJIs9U1CIwalA0k9PimJgax7ghMYwaHO3fy3SlAlyLy826/eVUnGqirqmF2sYW6hpdnGpqodnlpsVlaHK5aXYZXG43ToeDIIdYD6fgchtOnGrmxKlGTpxqoqK2iZrGltPvHx7sJGtQFINjwnAbcBuDy21wG4NDhIgQJ+EhTiJCnESGBBHsdCACIoIADrH+nXsWZnktHP7sgZwB5BtjDgKIyAvAMiCvzTHLgGeNVb02ikiciCRjXU10dO4yYL7n/D8DHwCdFo7m4hyO3ZfZbqv3gild3u79nS7kfbw793jB4MDgxIUTNw7cBOEiCFeH7wLgRqglghoiqDaRVBNBlYmgyj2CaiKpIZzjZgBFJpFCk0SRSaSxIQQ6vZ6DEGczCVF1xEeGkBAVyoikKDIHRjElLY4JqbFEhwV3/gZK9TNBTgcLRnfvgmTVDc3sP17LvuM17Dtew/7jtRypqMPhEJwOcIrg8Fy9FFe6qGtyUdfUQl2Ti2aXG4PVEu1T/m5NfrYUoKDN60Ksq4rzHZNynnMHGWNKAIwxJSLi9X9fRJYDywFGJsdwNPacotnhqAKD91/kBvF6ikE6PN6rDtsxO8oj55zmxoERBy5xYnDiFicuCaLBGU29M4qG04/o088bnZEYcSBYf020XgYHO4Qgp4Ngh5DuFLIcDoKcQojTQXiIk/Bg5+mvYW2et772a9usUsonMWHBTBs2gGnDBlzU+xhjcBtocbsJ+3/ej/Fn4fD+u9S3Y3w5t1PGmMeAx8BqqprxnRe6crpSSvVLIoJTwOnouDnZn7PjFgJpbV6nAsU+HtPZucc9zVl4vpZ2Y2allFLn4c/CsQXIEpF0EQkBbgJWtztmNXCrWGYBVZ5mqM7OXQ3c5nl+G7DKj9+DUkqpdvzWVGWMaRGRe4C3sYbUPmWMyRWRFZ79jwBrsEZU5WMNx/1yZ+d63voB4EURuRM4CvSOFVGUUqqP0BsAlVJKeaUrACqllOoWWjiUUkp1iRYOpZRSXaKFQymlVJf0i85xEakB9tqdo4sSgXK7Q3RBb8sLmrkn9La80Psy+zPvMGPMOStQ9Zf
VcvZ6GxkQyERka2/K3NvygmbuCb0tL/S+zHbk1aYqpZRSXaKFQymlVJf0l8LxmN0BLkBvy9zb8oJm7gm9LS/0vsw9nrdfdI4rpZTqPv3likMppVQ30cKhlFKqS/p04RCRxSKyV0TyPeuTBxwRSROR90Vkj4jkisi3PNvjReRdEdnv+Xpxy3p1MxFxisgOEXnd8zrQ88aJyD9F5FPP//UlvSDzdzw/Ezki8ryIhAVaZhF5SkRKRSSnzbYOM4rIjzyfx70isihA8v7K83OxS0ReEZG4QMnbUeY2+74vIkZEEtts83vmPls4RMQJPAQsAcYCN4vIWHtTedUCfM8YMwaYBdztybkSeM8YkwW853kdSL4F7GnzOtDz/gF4yxgzGpiElT1gM4tICvBNINsYMx5reYGbCLzMzwCL223zmtHzc30TMM5zzv96Pqc96RnOzfsuMN4YMxHYB/wIAiYveM+MiKQBV2AtL9G6rUcy99nCAcwA8o0xB40xTcALwDKbM53DGFNijNnueV6D9QstBSvrnz2H/Rm4zpaAXohIKnAV8ESbzYGcNwaYBzwJYIxpMsZUEsCZPYKAcBEJAiKwVsEMqMzGmLXAiXabO8q4DHjBGNNojDmEtQ7PjJ7I2cpbXmPMO8aYFs/LjVgrjkIA5PXk8/Z/DPA74Aecvax2j2Tuy4UjBSho87rQsy1gichwYAqwCRjkWQ0Rz9eBNkZr7/dYP7DuNtsCOe8IoAx42tO89oSIRBLAmY0xRcCvsf6aLMFaHfMdAjhzGx1l7A2fyTuANz3PAzaviFwLFBljPmm3q0cy9+XCIV62BezYYxGJAl4Cvm2MqbY7T0dE5Gqg1Bizze4sXRAETAUeNsZMAU5hfxNPpzz9AsuAdGAIECkiX7I31UUL6M+kiPwYq+n4r62bvBxme14RiQB+DPyXt91etnV75r5cOAqBtDavU7Eu9QOOiARjFY2/GmNe9mw+LiLJnv3JQKld+dqZDVwrIoexmv8WishfCNy8YP0sFBpjNnle/xOrkARy5s8Ah4wxZcaYZuBl4FICO3OrjjIG7GdSRG4Drga+aM7c3BaoeTOw/qD4xPM5TAW2i8hgeihzXy4cW4AsEUkXkRCsDqPVNmc6h4gIVtv7HmPMb9vsWg3c5nl+G7Cqp7N5Y4z5kTEm1RgzHOv/9N/GmC8RoHkBjDHHgAIRGeXZdDmQRwBnxmqimiUiEZ6fkcux+r8COXOrjjKuBm4SkVARSQeygM025DuLiCwGfghca4ypa7MrIPMaY3YbYwYaY4Z7PoeFwFTPz3nPZDbG9NkHsBRrlMQB4Md25+kg4xysS8ldwE7PYymQgDUiZb/na7zdWb1knw+87nke0HmBycBWz//zq8CAXpD5Z8CnQA7wHBAaaJmB57H6YJqxfoHd2VlGrCaWA1jLHCwJkLz5WP0CrZ+/RwIlb0eZ2+0/DCT2ZGadckQppVSX9OWmKqWUUn6ghUMppVSXaOFQSinVJVo4lFJKdYkWDqWUUl2ihUMppVSXaOFQSinVJf8fQr0r1EAWxzMAAAAASUVORK5CYII=", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "ax = plt.subplot()\n", "pd.merge(\n", " X,\n", " y,\n", " left_on=\"stay_id\",\n", " right_on=\"stay_id\"\n", ").groupby('IonoC')[\"heartrate\"].plot(kind='kde', ax=ax)\n", "ax.set_xlim(0, 150)\n", "ax.legend()" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "ax = plt.subplot()\n", "pd.merge(\n", " X,\n", " y,\n", " left_on=\"stay_id\",\n", " right_on=\"stay_id\"\n", ").groupby('IonoC')[\"resprate\"].plot(kind='kde', ax=ax)\n", "ax.set_xlim(0, 30)\n", "ax.legend()" ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 81, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZIAAAD4CAYAAADGmmByAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAs1klEQVR4nO3deXxU9bnH8c+TjRAIS8IqQRKQRUStiAhata1aBRfcwaooarkuVG3tYq2veu3itb3W2sVKqftyBUFUtIhrtdoEJKCigGgmgIQ1TNhDyPbcP86AQ5gkk2ROzpmZ5/168UrmbPPND5gn5/zO+f1EVTHGGGNaK8XrAMYYY+KbFRJjjDFtYoXEGGNMm1ghMcYY0yZWSIwxxrRJmtcBYqlHjx6an5/vdQxjjIkbS5Ys2aqqPdtyjIQqJPn5+RQXF3sdwxhj4oaIrG3rMezSljHGmDaxQmKMMaZNXC0kInK2iKwSkRIRuSPC+mEiUiQi+0TkxxHWp4rIRyLyqps5jTHGtJ5rfSQikgo8BJwJlAGLRWSeqq4I26wCuAW4oJHD3AqsBLq0NkdNTQ1lZWVUVVW19hCuy8zMJC8vj/T0dK+jGGNMi7nZ2T4aKFHVUgARmQlMAA4UElXdAmwRkXMa7iwiecA5wG+BH7U2RFlZGdnZ2eTn5yMirT2Ma1SVYDBIWVkZBQUFXscxxpgWc/PSVj9gXdjrstCyaD0I/BSob2ojEZkqIsUiUlxeXn7I+qqqKnJzc31ZRABEhNzcXF+fMRljTFPcLCSRPrmjGmpYRM4Ftqjqkua2VdUZqjpKVUf17Bn5Vmi/FpH9/J7PGGOa4mYhKQP6h73OAzZEue/JwPkisgaYCXxHRJ6JbTxjjIljVTth6VNQ4/3VDDcLyWJgsIgUiEgGMAmYF82OqvpzVc1T1fzQfu+o6pXuRXXXggULGDp0KEcccQT33Xef13GMMfGuvh7mfh/m/QD++SPweF4p1wqJqtYC04DXce68el5Vl4vIDSJyA4CI9BGRMpzO9LtEpExEWn2Hlh/V1dVx880389prr7FixQqee+45VqxY0fyOxhjTmH//Hr5YAP3HwMfPQvFjnsZxdYgUVZ0PzG+wbHrY95twLnk1dYx3gXddiNcuPvzwQ4444ggGDhwIwKRJk3j55ZcZPny4x8mMMXFp1QJ493/g2MthwkPwfxPhtZ9Bn6Oh/2hPIiXUWFvNueeV5azYsDOmxxx+WBfuPu+oRtevX7+e/v2/7irKy8tj0aJFMc1gjEkSwQDMnQp9joFz/wgpqXDxP2DGt+D5yTD1Pcju3e6xbIgUl2mEa5d2l5YxpsX27YaZVzjFY+IzkN7RWd6xO0x8FvZuh9lXQ11Nu0dLqjOSps4c3JKXl8e6dV8/TlNWVsZhhx3W7jmMMXFMFeZNg62r4Mq50H3Awev7jIDz/wJzr4c37oJxv2vXeHZG4rITTjiBL7/8ktWrV1
NdXc3MmTM5//zzvY5ljIknhX+B5S/C6XfDoG9H3uaYS2HMTbBoOnwyq13jJdUZiRfS0tL461//yllnnUVdXR3XXnstRx3V/mdGxpg4VfoevHU3DJ8AJ9/a9LZn/go2LoNXboFew6Dvse0S0QpJOxg/fjzjx4/3OoYxJt5sXwdzpkCPIc4dWs31r6amw6WPw99Pg1lXOp3vWTmux7RLW8YY40c1VU4xqKtxOtM7ZEe3X+deMPFp2LUJXrgO6uvczYkVEmOM8R9V+OftsPFjuPDv0OOIlu2fNwrG/y8E3oF//daViOGskBhjjN8UPwofPwOn/QyGtfKy+PHXwMjJ8P4fYOUrMY3XkBUSY4zxk68WwWt3wODvwmmHTCzbMuP+Fw4bCS/eCOVfxCZfBFZIjDHGL3Ztcp5Q75oHF82AlDZ+RKdnOv0laR1g1hXOiMEusEJijDF+UFsNz18N+3bCpGedJ9ZjoWuecydXMAAv3ejKSMFWSNrBtddeS69evRgxYoTXUYwxfvXGL2DdQucJ9d4xftas4FTnGZPPX4UP/hjbY2OFpF1cc801LFiwwOsYxhi/+vg5+HAGjJ0GR1/iznuMvRlGXAzv/BpK3o7poa2QtINTTz2VnBz3HwoyxsShDR/Dq7dB/ilwxj3uvY+Ic7bTc5jzfMm2NTE7dHI92f7aHbDp09ges8/RMM5mPTTGtMKeIMy6CrJ6wKVPQKrLH8kZnZyRg2d823nY8do3YnJYOyMxxhgv1NfBC9fC7s0w8Sno1KN93jd3kDOHyabP4NUfxuSQyXVGYmcOxhi/ePtXUPounP9X6Hd8+773kLPgWz+Hd++NyeHsjMQYY9rbipfhPw/C8VNg5FXeZDj1JzDk7JgcytVCIiJni8gqESkRkUMe0RSRYSJSJCL7ROTHYcv7i8i/RGSliCwXkWbGTva3yy+/nLFjx7Jq1Sry8vJ49NFHvY5kjPHKls/hpZsg74R2n4DqICkpzjheMeDapS0RSQUeAs4EyoDFIjJPVVeEbVYB3AJc0GD3WuB2VV0qItnAEhF5s8G+ceO5557zOoIxxg+qdsDM70F6Flz2lPPEuZc6dovJYdw8IxkNlKhqqapWAzOBCeEbqOoWVV0M1DRYvlFVl4a+3wWsBPq5mNUYY9xVXw8v3gDb18JlT0KXxJly281C0g9YF/a6jFYUAxHJB44DFsUmljHGeOD9+2HVfDjrXhhwktdpYsrNQhJpKq8WDfIiIp2BF4DbVDXiaGMiMlVEikWkuLy8POJx1IWxZWLJ7/mMMW30xRvwr3vhmIkweqrXaWLOzUJSBvQPe50HbIh2ZxFJxykiz6rq3Ma2U9UZqjpKVUf17NnzkPWZmZkEg0HfflirKsFgkMzMTK+jGGPcEAzA3Ouhzwg498Hmp8uNQ24+R7IYGCwiBcB6YBLwvWh2FBEBHgVWquoDbQmRl5dHWVkZjZ2t+EFmZiZ5eXlexzDGxFr1HufJdUlxnijPyPI6kStcKySqWisi04DXgVTgMVVdLiI3hNZPF5E+QDHQBagXkduA4cAxwFXApyLyceiQd6rq/JbmSE9Pp6CgoM0/jzHGtIgqzPsBlK+EK+ZA93yvE7nG1SfbQx/88xssmx72/SacS14NfUDkPhZjjIkPRQ/BZy/A6XfDEad7ncZV9mS7McbE2up/w5u/hCPPg2/GZjwrP7NCYowxsbSjDGZPcQZHvODhhOxcb8gKiTHGxEpNldO5XrsPJv0fdMj2OlG7SK7Rf40xxi2qMP/HsGEpTHwWegz2OlG7sTMSY4yJhSVPwEdPwyk/hiPP9TpNu7JCYowxbbVuMcz/CRxxBnz7Tq/TtDsrJMYY0xa7NsPzV0HXfnDRPyAl1etE7c76SIwxprXqamD2NbB3O1z/JmTleJ3IE1ZIjDGmtd64C74qhIsegT5He53GM3ZpyxhjWuOTWbBoOoy5CY651Os0nrJCYowxLbVxGbxyKwz4Jpz5K6/TeM4KiTHGtE
RlBcy6Ajp2h0ufgNR0rxN5zvpIjDEmWvV18MJ1sGsTTFkAnQ+dAykZWSExxphovfMbCLwD5/0Z8o73Oo1v2KUtY4yJxop58MEDMPJqOP5qr9P4ihUSY4xpTvkqeOlG6Hc8jP9fr9P4jhUSY4xpStVOmHkFpHeEy56GtA5eJ/Id6yMxxpjG1Nc7ZyIVpXD1PGcYFHMIKyTGGNOYD/4An78KZ98H+d/0Oo1v2aUtY4yJ5Mu34J3fwtGXwok3eJ3G11wtJCJytoisEpESEbkjwvphIlIkIvtE5Mct2dcYY1xTsdp5XqT3COdW3ySYLrctXCskIpIKPASMA4YDl4vI8AabVQC3APe3Yl9jjIm96kqYdaXz/cSnISPL2zxxwM0zktFAiaqWqmo1MBOYEL6Bqm5R1cVATUv3NcaYmFOFV26Bzcvh4kchp8DrRHHBzULSD1gX9rostCym+4rIVBEpFpHi8vLyVgU1xhjAGc3309nwnV/A4DO8ThM33CwkkS4qaqz3VdUZqjpKVUf17Gnj3hhjWmnNB/D6L2DYufDN271OE1fcLCRlQP+w13nAhnbY1xhjWmbHememw5yBcMHDkGI3tLaEm621GBgsIgUikgFMAua1w77GGBO92n3OnOs1e2HSs5DZxetEcce1BxJVtVZEpgGvA6nAY6q6XERuCK2fLiJ9gGKgC1AvIrcBw1V1Z6R93cpqjEli838C65fAxGeg51Cv0xxQVVPH5Ec/JDVFGNonm2F9shnaJ5shvbPp1MFfz5K7mkZV5wPzGyybHvb9JpzLVlHta4wxMbXkCVj6JHzzR3DkeV6nOUjxmm18uKaCQT078UnZdiqr6w6sOzwn66DiMqxPNvm5nUhL9eaSnL/KmjHGtJeyYudsZNDp8J27vE5ziMLAVtJShJenfZOs9FTKtu1l5aadrNq0i1WbdvH5pp28vXIz9aHbkDLSUhjcq3NYgenCsD7Z9MrugLj8QKUVEmNM8tm9BWZdBdl94eJHICXV60SHKAwEObZ/NzqHLmMdnpvF4blZnHVUnwPbVNXUUbJlt1NcNu/i8027+ODLrcxduv7ANt2y0hna++viMjR0FtM5hpfHrJAYY5JLXQ3MngJ7t8F1b0BWjteJDrGrqoZP1+/gxtMGNbldZnoqI/p1ZUS/rgct37anms837WLVpp0HCsycJWXsCbs8lte9I8P6xObGAiskxpjk8uYvYe0HcOEM6HuM12kiWrymgrp65aRBua3av3unDMYOymVs2P719cr67XsPFJjPQ5fIYsEKiTEmeSybDQv/5ozme+xEr9M0qrAkSEZaCiMHdI/ZMVNShP45WfTPyeLM4b0PLJcYPHtpT90YY5LDpk9h3g/g8JPgu7/xOk2TCgNBjj+8O5np/uu7icQKiTEm8VVWONPlduwGlz4BqeleJ2rUtj3VrNi4s9WXtbxgl7aMMYmtvg7mfh92boApr0F27+b38dCi1UGAg/o3/M4KiTEmsf3rXih5C859EPqf4HWaZhUGgmRlpHJMXjevo0TNLm0ZYxLXylfh/fvhuKvg+Gu8ThOVwkCQE/JzyEiLn4/n+ElqjDEtUf4FvHgDHDYSxt8fF9PlbtlZRcmW3XHVPwJWSIwxiWjfLph1BaR1cKbLTc/0OlFUikqd/pGTBvXwOEnLWB+JMSaxqMJLN0IwAJNfhq4Rx4X1paJAkC6ZaQw/LL6GsrdCYoxJLB/8EVa+AmfdCwWneJ2mRQoDQU4cmEtqiv8vw4WzS1vGmMRR8ja882sYcTGMucnrNC2yrqKSryoq465/BKyQGGMSxbY1MOda6HkknP+XuOhcDxev/SNghcQYkwiqK2HWlYDCpGcgo5PXiVqsKBAkt1MGQ3p39jpKi1kfiTEmvqnCq7fBps/gitmQM9DrRC2mqhQFgowZlOv6JFRusDMSY0x8+3AGLJsF374TBp/pdZpWWb11D5t2VsVl/whYITHGxLO1hfD6nTB0PJzyY6/TtFphIH
77R8DlQiIiZ4vIKhEpEZE7IqwXEflzaP0yERkZtu6HIrJcRD4TkedEJD6eKDLGtI+dG+D5q6F7Plw4HVLi9/fiokCQvl0zyc/N8jpKq7jW8iKSCjwEjAOGA5eLyPAGm40DBof+TAUeDu3bD7gFGKWqI4BUYJJbWY0xcaZ2Hzw/GWoqYeKzkNm1+X18qr5eKSoNMjZO+0fA3TOS0UCJqpaqajUwE5jQYJsJwFPqWAh0E5G+oXVpQEcRSQOygA0uZjXGxJPXfgZli+GCv0GvYV6naZMvtuyiYk81YwfGZ/8IRFlIROQFETlHRFpSePoB68Jel4WWNbuNqq4H7ge+AjYCO1T1jUayTRWRYhEpLi8vb0E8Y0xcWvoULHkcTr4Nhjf83TT+FJbE3/wjDUVbGB4Gvgd8KSL3iUg0vwJEOkfTaLYRke44ZysFwGFAJxG5MtKbqOoMVR2lqqN69uwZRSxjTNxavwT+eTsM/Bac/kuv08REYSDIgNws8rrHZ/8IRFlIVPUtVb0CGAmsAd4UkUIRmSIijc1ZWQb0D3udx6GXpxrb5gxgtaqWq2oNMBc4KZqsxpgEtbscZl0FnfvAJY9DSnzMZ96U2rp6FpUG4/a23/2ivlQlIrnANcD1wEfAn3AKy5uN7LIYGCwiBSKSgdNZPq/BNvOAyaG7t8bgXMLaiHNJa4yIZInT+3Q6sDL6H8sYk1DqamHOFKgMOk+uZ+V4nSgmlm/Yya59tYyN09t+94vqyXYRmQsMA54Gzgt92APMEpHiSPuoaq2ITANex7nr6jFVXS4iN4TWTwfmA+OBEqASmBJat0hE5gBLgVqcwjWjdT+iMSbuvXU3rHkfLvw79D3W6zQxs398rTED47swRjtEyiOqOj98gYh0UNV9qjqqsZ1C+8xvsGx62PcK3NzIvncDd0eZzxiTqD6dA0V/hdFT4djEegqgMBBkcK/O9MqO78fkor209ZsIy4piGcQYYw6x6TOY9wM4fCx897dep4mp6tp6Fq+uiPv+EWjmjERE+uDcottRRI7j67usuuA822GMMe7Yu80Z0bdDF7j0SUjL8DpRTH1Stp29NXVx3z8CzV/aOgungz0PeCBs+S7gTpcyGWOSXX09zJ0KO8pgynzI7u11opgrLAkiEv/9I9BMIVHVJ4EnReRiVX2hnTIZY5Lde/fBl2/AOQ9A/9Fep3FFUelWhvftQres+D/Tau7S1pWq+gyQLyI/arheVR+IsJsxxrTe5/Phvd/BN66EUdd6ncYVVTV1LF27natPGuB1lJho7tLW/mnG4m/KLmNM/Nn6Jbz4X3DYcXDOH+JuutxoLVm7jeq6+rgdNr6h5i5t/T309Z72iWOMSVr7djmd66npcNnTkB7ft8Q2pTCwldQU4YSC+O8fgegHbfy9iHQRkXQReVtEtjY29pUxxrSYKrx8M2z9whn+pFv/5veJY4WBIMfmdaVzh8SY7Tza50i+q6o7gXNxxscaAvzEtVTGmOTynz/BipfhjHtg4Glep3HV7n21LCvbEdej/TYUbSHZPzDjeOA5Va1wKY8xJtkE3oG374GjLoKTfuB1GtctXl1BXb0mTP8IRD9Eyisi8jmwF7hJRHoCVe7FMsYkhW1rYc510HMYTPhrwnauhysMbCUjNYXjB3T3OkrMRDuM/B3AWJypb2uAPRw626ExxkSvZq/TuV5fBxOfgYxOze+TAAoDQUYO6EZmevwPg79fS3p6jsR5niR8n6dinMcYkwxU4dUfwqZl8L3nIXeQ14naxfbKalZs3MkPzxjidZSYinYY+aeBQcDHQF1osWKFxBjTGosfgU+eg2/9HIac5XWadrOwtALV+J5WN5Joz0hGAcNDw74bY0zrrS2CBXfAkHFw6k+9TtOuigJb6ZieyrF53byOElPR3rX1GdDHzSDGmCSwcyPMvhq6DYCL/g4pUU/SmhAKA0FOKMghIy2xfu5oz0h6ACtE5ENg3/6Fqnq+K6mMMYmnttopIvt2w+
SXIbOr14na1ZZdVXy5ZTcXH5/ndZSYi7aQ/LebIYwxSWDBHbBuEVz6BPQ60us07a4o4EyrmwgTWTUUVSFR1fdEZAAwWFXfEpEsnHnYjTGmeR89A8WPwkm3wFEXep3GEwtLg2RnpnHUYYl3JhbtWFvfB+YAfw8t6ge85FImY0wiWb8UXv0RFJwGp9/tdRrPFAaCnFiQS2pK4j10GW2Pz83AycBOAFX9EujV3E4icraIrBKREhG5I8J6EZE/h9YvE5GRYeu6icgcEflcRFaKyNgosxpj/GLPVph1FXTu5QzGmJoYgxS2VNm2StYGKxPyshZEX0j2qWr1/hehhxKbvBVYRFKBh4BxwHDgchEZ3mCzccDg0J+pwMNh6/4ELFDVYcCxwMoosxpj/KCuFuZMgT3lMPFp6JSYH6LRONA/ckRitkG0heQ9EbkT6CgiZwKzgVea2Wc0UKKqpaEiNJNDh1WZADyljoVANxHpKyJdgFOBRwFUtVpVt0eZ1RjjB2/fA6v/Dec96ExUlcSKAkFyO2UwpFe211FcEW0huQMoBz4F/guYD9zVzD79gHVhr8tCy6LZZmDo/R4XkY9E5BERiTgQj4hMFZFiESkuLy+P8scxxrjqs7lQ+Gc44Xr4xve8TuMpVaWoNMiYgbmkJGD/CEQ/aGM9Tuf6Tap6iar+I4qn3CO1WMN9GtsmDRgJPKyqx+EMEnlIH0so2wxVHaWqo3r27NlMJGOM6zavgJenQf8T4az/8TqN59YEK9m4oyrhhkUJ12QhCXWG/7eIbAU+B1aJSLmI/DKKY5cB4dOc5QEbotymDChT1UWh5XNwCosxxs/2bodZV0CHznDpk5CW4XUizxUGtgKJ+fzIfs2dkdyGc7fWCaqaq6o5wInAySLyw2b2XQwMFpECEckAJgHzGmwzD5gcKlhjgB2qulFVNwHrRGRoaLvTgRXR/1jGmHZXXw8v/hds/wouewq69PU6kS8UBoL06ZJJQY/EHSa/uXvxJgNnqurW/QtUtTQ0X/sbwB8b21FVa0VkGvA6zsOLj6nqchG5IbR+Ok5fy3igBKgEpoQd4gfAs6EiVNpgnTHGb/79e/hiAYy/Hw4f43UaX1BVFgaCnDakJ5LAk3Y1V0jSw4vIfqpaLiLpkXZosN18nGIRvmx62PeK84xKpH0/xhl12Bjjd6sWwLv/A8d+z+lgNwB8sXk3wT3VjEngy1rQ/KWt6lauM8Yki2AA5k6FvsfCuQ8kxXS50UqG/hFo/ozkWBHZGWG5AJku5DHGxJN9u2HmFZCS6kyXm97R60S+UhgIcnhOFnnds7yO4qomC4mq2sCMxpjIVGHeNNi6Cq6cC90O9zqRr9TVKwtLg5xzdOLfdJBYs6sYY9pP4V9g+YvOQIyDvu11Gt9ZvmEHu6pqE/r5kf2skBhjWq70XXjrbhh+AZx8q9dpfGn/+FpjB1ohMcaYg23/CmZPgR5DYMJD1rneiMJAkCN6daZXl8TvTrZCYoyJXs1eZ1j4+lqY+KzzBLs5RHVtPYvXVCT83Vr7JefkAMaYllOFf94OGz+Gy2dCjyO8TuRby8q2U1ldlzSFxM5IjDHRKX4UPn4WTvsZDB3ndRpfKwwEEYETC6yQGGOM46tF8NodMPi7cFrEgbhNmKJAkCP7dKF7p+QYtNIKiTGmabs2wfOToWseXDQDUuxjoylVNXUs+Wpb0lzWAusjMcY0pbYanr8a9u2Eq16Ejt29TuR7S9duo7q2PmGn1Y3ECokxpnFv/ALWLYRLHoPew71OExcKA0FSU4QT8nO8jtJu7BzVGBPZx8/BhzNg7DQYcbHXaeJGYWArx+R1JTuz2QHSE4YVEmPMoTZ8DK/eBvmnwBn3eJ0mbuzeV8uysh1J8TR7OCskxpiD7Qk6Dx1m9YBLn4BUuwIercVrKqitV04a1MPrKO3K/oUYY75WVwsvXAu7N8O1C6BTcn0gtlVRIEhGagrHD0iumxKskBhjvvbOr50BGSc8BP1Gep
0m7hQGtnLc4d3omJFcM3DYpS1jjGP5S/CfB2HUtXDclV6niTvbK6tZvmFn0l3WApcLiYicLSKrRKRERA55HFYcfw6tXyYiIxusTxWRj0TkVTdzGpP0tnwOL90EeSfA2b/zOk1cWrS6AlWSYv6RhlwrJCKSCjwEjAOGA5eLSMMb0ccBg0N/pgIPN1h/K7DSrYzGGKBqB8z8HmR0gsuehrTkGNYj1ooCQTLTU/hG/25eR2l3bp6RjAZKVLVUVauBmcCEBttMAJ5Sx0Kgm4j0BRCRPOAc4BEXMxqT3OrrYe5/wfa1cNmT0CXxp4V1S2FgKyfk55CRlnw9Bm7+xP2AdWGvy0LLot3mQeCnQH1TbyIiU0WkWESKy8vL2xTYmKTz/v3wxWtw1r0w4CSv08St8l37+GLz7qTsHwF3C0mkadM0mm1E5Fxgi6ouae5NVHWGqo5S1VE9e/ZsTU5jktMXb8C/7oVjJsHoqV6niWtFpaFpdZOwfwTcLSRlQP+w13nAhii3ORk4X0TW4FwS+46IPONeVGOSTDAAc6+HPiPgvAdtutw2KgoEye6QxojDungdxRNuFpLFwGARKRCRDGASMK/BNvOAyaG7t8YAO1R1o6r+XFXzVDU/tN87qmr3IxoTC9V7nCfXJcWZLje9o9eJ4l5RYCsnDswhLTX5+kfAxQcSVbVWRKYBrwOpwGOqulxEbgitnw7MB8YDJUAlMMWtPMYYnOlyX54G5Svhyheg+wCvE8W99dv3siZYyVVj872O4hlXn2xX1fk4xSJ82fSw7xW4uZljvAu860I8Y5JP0UOwfC6cfjcM+o7XaRJCUcDpH0mmiawaSs7zMGOS0ep/w5u/hCPPh2/+0Os0CaMwsJXuWekM7Z3tdRTPWCExJhnsKIPZ10DuEXDB36xzPUZUlYWBIGMH5ZKSkrxtaoXEmERXUwWzroS6Gpj0LHRI3t+cY21tsJINO6oYm6TPj+xno/8ak8hUYf7tsOEjmPR/0GOw14kSSqH1jwB2RmJMYlvyOHz0DJz6Exh2jtdpEk5hYCu9u3RgYI9OXkfxlBUSYxLVug9h/k/hiDPhWz/3Ok3CUVUWlgYZOzAXSfI+JyskxiSiXZvh+cnQtR9c/A9ISa6JltrDl1t2s3V3ddKOrxXO+kiMSTR1Nc4dWlU74Lo3oWNyTfvaXgpLtgLJO75WOCskxiSaN+6Crwrh4kedsbSMKwoDQfrndKR/TpbXUTxnl7aMSSSfzIJF02HMzXD0JV6nSVh19U7/yEkD7bIWWCExJnFs/AReuQXyT4Ezf+V1moS2YsNOdlbV2mWtECskxiSCygrnocOsXLjkcUi1q9ZuKiq1/pFw9q/NmHhXXwcvXAe7NsGUBdDZJnhzW2EgyKCenejdJdPrKL5gZyTGxLt3fgOBd2D8/ZB3vNdpEl5NXT0frq6w237DWCExJp6tmAcfPADHXwPHX+11mqSwrGw7ldV1ST8sSjgrJMbEq/JV8NKN0G8UjPu912mSRmGJM77WiQOtkOxnhcSYeFS1E2Ze4UyTe9lTkNbB60RJo6g0yJF9u5DTKcPrKL5hhcSYeFNfDy/eABWlcOmTzjAopl1U1dRRvHabXdZqwO7aMibefPAHWPVPOPs+yD/Z6zRJZelX26iurbdC0oCdkRgTT758E975LRx9GZx4g9dpkk5RIEhqijC6IMfrKL7iaiERkbNFZJWIlIjIHRHWi4j8ObR+mYiMDC3vLyL/EpGVIrJcRG51M6cxcaGi1HlepPcIOO9PNl2uBwoDQUb060p2ZrrXUXzFtUIiIqnAQ8A4YDhwuYgMb7DZOGBw6M9U4OHQ8lrgdlU9EhgD3BxhX2OSR/UemHUVIDDxaciwgQLb2559tXyybrtd1orAzTOS0UCJqpaqajUwE5jQYJsJwFPqWAh0E5G+qrpRVZcCqOouYCVgPYomOanCK7fC5uVwyaOQU+B1oqS0eE0FtfVqhSQCNwtJP2Bd2OsyDi0GzW4jIv
nAccCiSG8iIlNFpFhEisvLy9ua2Rj/WfgwfDobvnMXHHGG12mSVlEgSHqqMGqA9Y805GYhiXQBV1uyjYh0Bl4AblPVnZHeRFVnqOooVR3Vs6eNMWQSzOr3nflFhp0Lp9zudZqkVhgIctzh3emYYbNNNuRmISkD+oe9zgM2RLuNiKTjFJFnVXWuizmN8acd652ZDnMHwQUPW+e6h3ZU1vDZhh2MtafZI3KzkCwGBotIgYhkAJOAeQ22mQdMDt29NQbYoaobRUSAR4GVqvqAixmN8afaffD8Vc7Xic9CZhevEyW1RauDqGL9I41w7YFEVa0VkWnA60Aq8JiqLheRG0LrpwPzgfFACVAJTAntfjJwFfCpiHwcWnanqs53K68xvjL/J7B+CUx8BnoO8TpN0isMBMlMT+Ebh3fzOoovufpke+iDf36DZdPDvlfg5gj7fUDk/hNjEt+SJ2Dpk06fyJHneZ3G4HS0n5CfQ4c06x+JxJ5sN8ZPyoqds5FBp8O3f+F1GgOU79rHqs27bDbEJlghMcYvdm9xHjrM7gsXPwIp9tuvHywsdYaNt472xtmgjcb4QV2Nc4fW3m1w/ZuQZc8q+EVRaZDOHdI4ul9Xr6P4lhUSY/zgzV/C2v/ARf+APkd7ncaEKQoEObEgh7RUu4DTGGsZY7y2bDYs/BuceCMcc5nXaUyYDdv3snrrHusfaYYVEmO8tOlTmPcDGHAyfPfXXqcxDRQFnP6Rkwb18DiJv1khMcYrlRXOdLkdu8OlT0CqDU3uN4WBIN2z0hnWJ9vrKL5mfSTGeKG+Dl64HnZthCmvQedeXicyDagqC0uDjBmYS0qKPdbWFDsjMcYL/7oXAm/DuN9D3iiv05gIvqqoZP32vTYsShSskBjT3la+Cu/fDyMnw6gpzW9vPFEY6h8Za/0jzbJCYkx7Kv8CXrwB+h0P4+/3Oo1pQmEgSK/sDgzq2cnrKL5nhcSY9lK1E2ZdAWkd4LKnnK/Gl1SVokCQsYNyERu+v1nW2W5Me1CFl26EYAAmvwxd87xOZJpQsmU3W3fvs/6RKFkhMaY9fPAAfP4qnHUvFJzidRrTCFUlUL6HxwvXAPb8SLSskBjjtpK34O1fw4hLYMxNXqcxYVSV1Vv3sLC0gqLSIAtLg5Tv2gfAKYN70D8ny+OE8cEKiTFu2rYG5lwHvY+C8/9s0+V6TFVZG6w8UDQWlgbZvNMpHL2yO3DSoFzGDsxlzMBcBuRaEYmWFRJj3FJdCTOvBBQmPg0ZdvdPe1NV1lXspah0q3PWEQiyaWcVAD2zOzBm4P7CkUNBj07Wsd5KVkiMcYMqvHIrbP4MrpgNOQO9TpQ01lWEzjgCzhnHhh1O4ejROYMxobONsYNyGWiFI2askBjjhkV/h0+fh2/fBYPP9DpNQivbVnngbGNhaZD12/cCkNvJKRw3Dsxh7KBcBvXsbIXDJVZIjIm1Nf+BN34BQ89x5l03MbVh+94DRWPh6iDrKpzC0T0rnTEDc5l66kDGDsplcC8rHO3F1UIiImcDfwJSgUdU9b4G6yW0fjxQCVyjqkuj2dcYX9q5AWZfDd3z4cKHIcWe+W2rTTuqnD6OgHNn1VcVlQB0y0rnxIIcrju5gDGDchnSK9sGV/SIa4VERFKBh4AzgTJgsYjMU9UVYZuNAwaH/pwIPAycGOW+xmuqEb5vYln49q1e1sL3POR4scoRYZnWw5wpULMXrvknZCbf1KyqSr1+/bVeFVVQwl7XH/y63tngwOvq2no+KdvOwtIgRYEga4JO4eiSmcaJA3O55qR8xgzMZVgfKxx+4eYZyWigRFVLAURkJjABCC8GE4CnVFWBhSLSTUT6AvlR7HuImg2fsf6eoQct+/qfmYZe6yHrwpc1tV3Ux1A9aF34+oPfq/F1gqJtPsbBuQ/e7tB9aWJdSoT3NJH9PO2nvPdIGc7vQM2LRcs2rJWtPg5fFwI98EFP2Id9WEHg0MIRS9mZaZxYkMuVYwYwZmAuR/
btQqoVDl9ys5D0A9aFvS7DOetobpt+Ue4LgIhMBaYCDOmbzYbOIw7ZRuXrj3sA5dB/jNrg4/Pg/SJvpzQ87kHBDloXvv3B/98avtfBH//SWLbw7eTgdYdkkxbkiNgOX7869Gdupo0a+bmcRU28Z3PZoto3Qns083cazXEjrdu/39aMftR2Oo6TD3mXpsXiUr5E+LtojZQUQQRSBFJESAmFc74ntE6Qxl7jfN1/nAOvJfRamngdev/hfbtY4YgjbhaSSP8CGv7O0tg20ezrLFSdAcwAGDVqlJ5w+wstyWiMMaaN3CwkZUD/sNd5wIYot8mIYl9jjDE+4OYtJYuBwSJSICIZwCRgXoNt5gGTxTEG2KGqG6Pc1xhjjA+4dkaiqrUiMg14HecW3sdUdbmI3BBaPx2Yj3PrbwnO7b9TmtrXrazGGGNaTzRWt3v4wKhRo7S4uNjrGMYYEzdEZImqjmrLMexpKWOMMW1ihcQYY0ybWCExxhjTJlZIjDHGtElCdbaLyC5gldc5mtED2Op1iChYztiynLFlOWNnqKpmt+UAiTaM/Kq23n3gNhEp9ntGsJyxZjljy3LGjoi0+VZXu7RljDGmTayQGGOMaZNEKyQzvA4QhXjICJYz1ixnbFnO2GlzxoTqbDfGGNP+Eu2MxBhjTDuzQmKMMaZN4raQhKblnSMin4vIShEZKyI5IvKmiHwZ+trdpzn/W0TWi8jHoT/jPc44NCzLxyKyU0Ru81t7NpHTb+35QxFZLiKfichzIpLpt7ZsIqev2jKU89ZQxuUicltomR/bM1JOz9tTRB4TkS0i8lnYskbbT0R+LiIlIrJKRM6K6j3itY9ERJ4E3lfVR0JzlmQBdwIVqnqfiNwBdFfVn/kw523AblW938tskYhIKrAeZ2rjm/FZe+7XIOcUfNKeItIP+AAYrqp7ReR5nOkShuOjtmwiZz4+aUsAERkBzARGA9XAAuBG4Pv4qz0by3kFHreniJwK7AaeUtURoWW/J0L7ichw4Dmcn+Mw4C1giKrWNfUecXlGIiJdgFOBRwFUtVpVtwMTgCdDmz0JXOBFvv2ayOlnpwMBVV2Lz9qzgfCcfpMGdBSRNJxfHDbgz7aMlNNvjgQWqmqlqtYC7wEX4r/2bCyn51T130BFg8WNtd8EYKaq7lPV1ThzRY1u7j3ispAAA4Fy4HER+UhEHhGRTkDv0AyLhL728jIkjecEmCYiy0KnnZ6floeZhPMbCfivPcOF5wSftKeqrgfuB74CNuLM+vkGPmvLJnKCT9oy5DPgVBHJFZEsnInw+uOz9qTxnOCv9tyvsfbrB6wL264stKxJ8VpI0oCRwMOqehywB7jD20gRNZbzYWAQ8A2c/8R/8CpguNClt/OB2V5naUqEnL5pz9AHxQSgAOfSQCcRudKrPI1pIqdv2hJAVVcCvwPexLlc9AlQ62WmSJrI6av2jIJEWNZs/0e8FpIyoExVF4Vez8H5wN4sIn0BQl+3eJRvv4g5VXWzqtapaj3wD6I4dWwn44Clqro59Npv7bnfQTl91p5nAKtVtVxVa4C5wEn4ry0j5vRZWwKgqo+q6khVPRXnEs2X+K89I+b0Y3uGNNZ+ZXx9JgWQRxSXPOOykKjqJmCdiAwNLTodWAHMA64OLbsaeNmDeAc0lnP/X2DIhTinxX5wOQdfLvJVe4Y5KKfP2vMrYIyIZImI4Pydr8R/bRkxp8/aEgAR6RX6ejhwEc7fvd/aM2JOP7ZnSGPtNw+YJCIdRKQAGAx82OzRVDUu/+CcKhYDy4CXgO5ALvA2zm8sbwM5Ps35NPBpaNk8oK8PcmYBQaBr2DI/tmeknL5qT+Ae4HOcD42ngQ4+bctIOX3VlqGc7+P8ovgJcHpomR/bM1JOz9sTp/BuBGpwzjiua6r9gF8AAZwpOcZF8x5xe/uvMcYYf4jLS1vGGGP8wwqJMcaYNrFCYowxpk2skBhjjGkTKyTGGGPaxAqJMcaYNrFCYowxpk3+H2
vSFXVeDp5UAAAAAElFTkSuQmCC", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "ax = plt.subplot()\n", "pd.merge(\n", " X,\n", " y,\n", " left_on=\"stay_id\",\n", " right_on=\"stay_id\"\n", ").groupby('IonoC')[\"o2sat\"].plot(kind='kde', ax=ax)\n", "ax.set_xlim(60, 100)\n", "ax.legend()" ] }, { "cell_type": "code", "execution_count": 86, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 86, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "ax = plt.subplot()\n", "pd.merge(\n", " X,\n", " y,\n", " left_on=\"stay_id\",\n", " right_on=\"stay_id\"\n", ").groupby('IonoC')[\"sbp\"].plot(kind='kde', ax=ax)\n", "ax.set_xlim(0, 300)\n", "ax.legend()" ] }, { "cell_type": "code", "execution_count": 85, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 85, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "ax = plt.subplot()\n", "pd.merge(\n", " X,\n", " y,\n", " left_on=\"stay_id\",\n", " right_on=\"stay_id\"\n", ").groupby('IonoC')[\"dbp\"].plot(kind='kde', ax=ax)\n", "ax.set_xlim(0, 200)\n", "ax.legend()" ] }, { "cell_type": "code", "execution_count": 98, "metadata": {}, "outputs": [], "source": [ "pd.merge(\n", " X,\n", " y,\n", " left_on=\"stay_id\",\n", " right_on=\"stay_id\"\n", ")[[\"IonoC\",\"last_7\"]].fillna(0).groupby('IonoC')[\"last_7\"].plot.bar()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pd.merge(\n", " X,\n", " y,\n", " left_on=\"stay_id\",\n", " right_on=\"stay_id\"\n", ")[[\"IonoC\",\"last_7\"]].fillna(0).groupby('IonoC')[\"last_30\"].plot.bar()" ] }, { "cell_type": "code", "execution_count": 90, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "code", "execution_count": 93, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 93, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "ax = plt.subplot()\n", "pd.merge(\n", " X,\n", " y,\n", " left_on=\"stay_id\",\n", " right_on=\"stay_id\"\n", ").groupby('IonoC')[\"pain\"].plot(kind='kde', ax=ax)\n", "ax.set_xlim(0, 20)\n", "ax.legend()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "categorical_features = [\n", " \"gender\",\n", " \"last_7\",\n", " \"last_30\"\n", "]\n", "\n", "continuous_features = [\n", " \"pain\",\n", " \"time\",\n", " \"age\",\n", " \"temperature\",\n", " \"heartrate\",\n", " \"resprate\",\n", " \"o2sat\",\n", " \"sbp\",\n", " \"dbp\"\n", "]+X_train.columns[14:-1].tolist()\n", "\n", "continuous_features = [\n", " \"pain\",\n", " \"time\",\n", " \"age\",\n", " \"temperature\",\n", " \"heartrate\",\n", " \"resprate\",\n", " \"o2sat\",\n", " \"sbp\",\n", " \"dbp\"\n", "]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "features_preprocessing = ColumnTransformer([\n", " (\"binary_encoder\", OrdinalEncoder(), categorical_features),\n", " (\"identity\", StandardScaler(), continuous_features),\n", " (\"missing\", MissingIndicator(), continuous_features),\n", " (\"nlp\", Pipeline([\n", " (\"cv\", CountVectorizer(ngram_range=(1,1), max_features=200)),\n", " (\"tf-idf\", TfidfTransformer())\n", " ]), \"chiefcomplaint\"),\n", "])\n", "\n", "features_preprocessing_without_nlp = ColumnTransformer([\n", " (\"binary_encoder\", OrdinalEncoder(), categorical_features),\n", " (\"identity\", StandardScaler(), continuous_features),\n", " (\"missing\", MissingIndicator(), continuous_features)\n", "])\n", "\n", "full_preprocessing = Pipeline([\n", " (\"features\", features_preprocessing_without_nlp),\n", " 
(\"imputer\", SimpleImputer(strategy=\"median\"))\n", "])\n", "\n", "pipeline = Pipeline([\n", " (\"preprocessing\", full_preprocessing),\n", " (\"mlp\", MLPClassifier(hidden_layer_sizes=(100,20), verbose=True, learning_rate_init=1e-3, batch_size=64, max_iter=100))\n", "])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "preprocesser = full_preprocessing.fit(X_train, y_train)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from transformers import BertTokenizer, BertModel\n", "import pickle\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bert_name = \"dmis-lab/biobert-v1.1\"\n", "drug_name = \"./models/ATC_2\"\n", "\n", "biobert_tokenizer = BertTokenizer.from_pretrained(bert_name)\n", "\n", "with open(drug_name+\"_encoder.model\", \"rb\") as f:\n", " drug_encoder = pickle.load(f)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#drug_columns = X_train.columns[14:-1]\n", "\n", "#columns_id = drug_encoder.transform(\n", "# np.expand_dims(np.array(X_train[drug_columns].columns), 1)\n", "#).flatten().astype(\"int32\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def get_drug_token_list (df):\n", "\n", " df_drug_tokens = (df[drug_columns] >= 1)*1\n", " df_drug_tokens = df_drug_tokens.rename(columns=dict(zip(drug_columns, columns_id)))\n", "\n", " df_drug_tokens_list = (df_drug_tokens*(df_drug_tokens.columns+1)).apply(lambda x: list(set(x.tolist()))[1:], axis=1) \\\n", " .tolist()\n", "\n", " df_drug_tokens_list = [torch.tensor(x)-1 for x in df_drug_tokens_list]\n", "\n", " return df_drug_tokens_list" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "X_train_preprocess = torch.tensor(preprocesser.transform(X_train), dtype=torch.float32)\n", 
"X_train_tokens = biobert_tokenizer(X_train[\"chiefcomplaint\"].tolist())[\"input_ids\"]\n", "#X_train_tokens_drug = get_drug_token_list(X_train)\n", "y_train_preprocess = torch.tensor(y_train.iloc[:,1:].values, dtype=torch.float32)\n", "X_test_preprocess = torch.tensor(preprocesser.transform(X_test), dtype=torch.float32)\n", "X_test_tokens = biobert_tokenizer(X_test[\"chiefcomplaint\"].tolist())[\"input_ids\"]\n", "#X_test_tokens_drug = get_drug_token_list(X_test)\n", "y_test_preprocess = torch.tensor(y_test.iloc[:,1:].values, dtype=torch.float32)\n", "X_train_tokens = [torch.tensor(x) for x in X_train_tokens]\n", "X_test_tokens = [torch.tensor(x) for x in X_test_tokens]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from torch import nn, optim\n", "from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence\n", "import torch\n", "import operator" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class neural_net (nn.Module):\n", " def __init__(self, n_features, n_outputs, device=\"cpu\"):\n", " super().__init__()\n", "\n", " self.embedding_encoder = nn.Sequential(*[\n", " nn.Linear(768, 250),\n", " nn.ReLU(),\n", " nn.Linear(250, 50),\n", " nn.ReLU(),\n", " nn.Linear(50, 20),\n", " nn.ReLU()\n", " ])\n", "\n", " self.network = nn.Sequential(*[\n", " nn.Linear(n_features+20, 100),\n", " nn.ReLU(),\n", " nn.Linear(100, 50),\n", " nn.ReLU(),\n", " nn.Linear(50, n_outputs),\n", " nn.Sigmoid()\n", " ])\n", "\n", " self.biobert_model = BertModel.from_pretrained(bert_name).to(device)\n", " self.drug_embedding = torch.load(f\"{drug_name}_embedding.model\").to(device) \n", "\n", " for x in self.biobert_model.parameters():\n", " x.requires_grad = False\n", "\n", " #self.drug_embedding.requires_grad = True\n", " #self.biobert_model.embeddings.requires_grad = True\n", " \n", " #self.loss = 
nn.BCELoss(weight=torch.tensor(y_train.iloc[:,1:].mean().values))\n", " self.loss = nn.BCELoss()\n", " #self.loss = nn.MultiLabelSoftMarginLoss()\n", " #self.loss = nn.CrossEntropyLoss(weight=torch.tensor(y_train.iloc[:,1:].mean().values))\n", " self.optimizer = optim.Adam(self.parameters(), lr=1e-3)\n", "\n", " def forward(self, x):\n", " \n", " x_data = x[0]\n", " x_tokens = x[1]\n", " #x_drugs = x[2]\n", "\n", " x_bert = self.biobert_model.embeddings.word_embeddings(x_tokens)\n", " x_bert_mask = (x_tokens != 0).unsqueeze(2)*1\n", " x_bert = (x_bert*x_bert_mask).sum(axis=1)/x_bert_mask.sum(axis=1)\n", "\n", " #x_drugs_embeddings = self.drug_embedding(x_drugs)\n", " #x_drugs_embeddings_mask = (x_drugs != self.drug_embedding.weight.shape[0]-1).unsqueeze(2)*1\n", " #x_drugs_embeddings_mask = x_drugs_embeddings_mask + 1e-8\n", " #x_drugs_embeddings = (x_drugs_embeddings*x_drugs_embeddings_mask).sum(axis=1)/x_drugs_embeddings_mask.sum(axis=1)\n", "\n", " x_embedding_encoded = self.embedding_encoder(x_bert)\n", " x = torch.concat([x_data, x_embedding_encoded], axis=1)\n", "\n", " y_hat = self.network(x)\n", "\n", " return y_hat\n", " \n", " def fit(self, x, y):\n", " \n", " self.train()\n", " self.optimizer.zero_grad()\n", "\n", " y_hat = self.forward(x)\n", "\n", " loss = self.loss(y_hat, y)\n", "\n", " loss.backward()\n", " self.optimizer.step()\n", "\n", " return loss\n", "\n", " def last_hidden_layer (self, x):\n", " \n", " self.eval()\n", " \n", " with torch.no_grad(): \n", " x_data = x[0]\n", " x_tokens = x[1]\n", " #x_drugs = x[2]\n", "\n", " x_bert = self.biobert_model.embeddings.word_embeddings(x_tokens)\n", " x_bert_mask = (x_tokens != 0).unsqueeze(2)*1\n", " x_bert = (x_bert*x_bert_mask).sum(axis=1)/x_bert_mask.sum(axis=1)\n", "\n", " x_embedding_encoded = self.embedding_encoder(x_bert)\n", " x = torch.concat([x_data, x_embedding_encoded], axis=1)\n", "\n", " # Use this instance's layers (not the global `network`), dropping the last ReLU, Linear and Sigmoid\n", " y_hat = self.network[:-3](x)\n", "\n", " return y_hat\n", "\n", " def predict(self, x):\n", " 
\n", " self.eval()\n", " \n", " with torch.no_grad(): \n", " y_hat = self.forward(x)\n", "\n", " return y_hat" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "device = \"cuda:0\"\n", "#device = \"cpu\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#network = neural_net(X_train_preprocess.shape[1], y_train_preprocess.shape[1], device=device)\n", "network = neural_net(X_train_preprocess.shape[1], 1, device=device)\n", "network = network.to(device)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from torch.utils.data import DataLoader\n", "from torchvision.ops.focal_loss import sigmoid_focal_loss\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data_loader = DataLoader(range(X_train_preprocess.shape[0]), shuffle=True, batch_size=1024)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.691106379032135\n", "On test : precision = 0.5, recall = 0.00028876696505919725\n", "0.6006030204272507\n", "On test : precision = 0.7114660697934683, recall = 0.6731157955529887\n", "0.575636284564858\n", "On test : precision = 0.7091583562171797, recall = 0.7084416209452306\n", "0.5641533084882058\n", "On test : precision = 0.7105074026160701, recall = 0.7136875541438059\n", "Epoch 0 - loss : 0.5566418178473846\n", "0.5154001116752625\n", "On test : precision = 0.7277034414536367, recall = 0.6726826451054\n", "0.5331675112247467\n", "On test : precision = 0.7340632731483918, recall = 0.6711425546250842\n", "0.5320046168951252\n", "On test : precision = 0.7303906490310673, recall = 0.6856771585330638\n", "0.5311751908242108\n", "On test : precision = 0.7530828460603254, recall = 0.6260467802483396\n", "Epoch 1 - loss : 0.5314786310437359\n", "0.5191752910614014\n", "On test : 
precision = 0.7150190114068441, recall = 0.7240350370584272\n", "0.5276658352058713\n", "On test : precision = 0.7223930122480847, recall = 0.7124843584560593\n", "0.5285998987617777\n", "On test : precision = 0.7104558877814692, recall = 0.74102416016941\n", "0.5277084383457602\n", "On test : precision = 0.7457774871962515, recall = 0.6587737029550486\n", "Epoch 2 - loss : 0.5267519308796412\n", "0.5291849374771118\n", "On test : precision = 0.7194110920526015, recall = 0.7266820675714698\n", "0.5206360940886016\n", "On test : precision = 0.7333908980089394, recall = 0.6949177014149581\n", "0.5217355852992973\n", "On test : precision = 0.7369723210143807, recall = 0.688131677736067\n", "0.521929228622652\n", "On test : precision = 0.7239691845535152, recall = 0.7191259986524209\n", "Epoch 3 - loss : 0.5215651713594606\n", "0.5235981941223145\n", "On test : precision = 0.7369445013311489, recall = 0.6927519491770141\n", "0.5186061596516336\n", "On test : precision = 0.7351477607393114, recall = 0.696794686687843\n", "0.5180819096849926\n", "On test : precision = 0.7319912079128784, recall = 0.7052170565020695\n", "0.5182426832245037\n", "On test : precision = 0.7510562235944102, recall = 0.6673404562518048\n", "Epoch 4 - loss : 0.5190195072300826\n", "0.5245155096054077\n", "On test : precision = 0.7069435141772717, recall = 0.7619597651362018\n", "0.5169979248306539\n", "On test : precision = 0.7158938869665513, recall = 0.7467994994705939\n", "0.5154471167580998\n", "On test : precision = 0.7308615369544987, recall = 0.7181634421022235\n", "0.5157833989474464\n", "On test : precision = 0.7234719058466211, recall = 0.7337087303879103\n", "Epoch 5 - loss : 0.515931355500523\n", "0.4985155463218689\n", "On test : precision = 0.7207000093835038, recall = 0.7392915583790548\n", "0.5164604614866842\n", "On test : precision = 0.7275522459764593, recall = 0.7288478198094138\n", "0.5142399074129798\n", "On test : precision = 0.7449022055763629, recall = 
0.6891904899412841\n", "0.5145066314163398\n", "On test : precision = 0.7475281552371384, recall = 0.6804312253344884\n", "Epoch 6 - loss : 0.5147569981556904\n", "0.49890613555908203\n", "On test : precision = 0.7430380398414578, recall = 0.6947251901049186\n", "0.5157188932494362\n", "On test : precision = 0.7427693882774508, recall = 0.6983347771681586\n", "0.5156954976160135\n", "On test : precision = 0.7385383567862006, recall = 0.7047357782269709\n", "0.515681599164722\n", "On test : precision = 0.74251832247557, recall = 0.7021368755414381\n", "Epoch 7 - loss : 0.5150245293786254\n", "0.49726372957229614\n", "On test : precision = 0.7318327350344426, recall = 0.7260564058138416\n", "0.513713940240369\n", "On test : precision = 0.7359238699444886, recall = 0.7146019828664935\n", "0.5135435411586097\n", "On test : precision = 0.7411841972314843, recall = 0.706083357397247\n", "0.5129304822299171\n", "On test : precision = 0.7369464967521198, recall = 0.7152757724516315\n", "Epoch 8 - loss : 0.5133467673500882\n", "0.5323821902275085\n", "On test : precision = 0.7329362366956352, recall = 0.7224949465781114\n", "0.510783731052191\n", "On test : precision = 0.7271386430678466, recall = 0.7355375878332852\n", "0.5100371665622464\n", "On test : precision = 0.7245657568238213, recall = 0.7448262585426894\n", "0.5103289775080063\n", "On test : precision = 0.7371029287873532, recall = 0.7158533063817499\n", "Epoch 9 - loss : 0.5115820087964021\n", "0.5278018712997437\n", "On test : precision = 0.7349821594408329, recall = 0.7236981422658582\n", "0.509880670521519\n", "On test : precision = 0.7158084914182475, recall = 0.7627298103763596\n", "0.5081589534804596\n", "On test : precision = 0.7391979949874686, recall = 0.7097410722879969\n", "0.5089700956677281\n", "On test : precision = 0.7339721762817395, recall = 0.7262007892963711\n", "Epoch 10 - loss : 0.5090479979032203\n", "0.519418478012085\n", "On test : precision = 0.7202764976958526, recall = 
0.7522379439792087\n", "0.5076188123462224\n", "On test : precision = 0.7349367960257734, recall = 0.7191259986524209\n", "0.5096459284943727\n", "On test : precision = 0.7378349093774625, recall = 0.7210029839253056\n", "0.5096920878190139\n", "On test : precision = 0.7328155339805825, recall = 0.7265376840889403\n", "Epoch 11 - loss : 0.5086245969126496\n", "0.5080099105834961\n", "On test : precision = 0.7295591469828823, recall = 0.7343343921455385\n", "0.5070724266000314\n", "On test : precision = 0.7358842285994603, recall = 0.721965540475503\n", "0.5059742887518299\n", "On test : precision = 0.7328332930075006, recall = 0.7288478198094138\n", "0.5064577097908602\n", "On test : precision = 0.7308452673706544, recall = 0.7294734815670421\n", "Epoch 12 - loss : 0.5078333315970023\n", "0.49358904361724854\n", "On test : precision = 0.7290276453765491, recall = 0.7361151217634035\n", "0.5067540569470661\n", "On test : precision = 0.7323598635861472, recall = 0.73380498604293\n", "0.5065122897648693\n", "On test : precision = 0.7542148234545647, recall = 0.6846664741553566\n", "0.5084922199827492\n", "On test : precision = 0.7233796296296297, recall = 0.7519973048416595\n", "Epoch 13 - loss : 0.5075114444086823\n", "0.49666109681129456\n", "On test : precision = 0.7252100840336134, recall = 0.7476176725382616\n", "0.5051100498969012\n", "On test : precision = 0.7369399334768147, recall = 0.7250938492636443\n", "0.5057181871649045\n", "On test : precision = 0.7466612294831277, recall = 0.7049764173645202\n", "0.5072105418011992\n", "On test : precision = 0.7283676553939165, recall = 0.7421792280296468\n", "Epoch 14 - loss : 0.5062240871447551\n", "0.5191906690597534\n", "On test : precision = 0.7395544554455445, recall = 0.7189816151698912\n", "0.504310804723513\n", "On test : precision = 0.7552971164569061, recall = 0.684522090672827\n", "0.5053237096883765\n", "On test : precision = 0.7340909090909091, recall = 0.7306285494272788\n", "0.5062640938053892\n", "On 
test : precision = 0.736219124480313, recall = 0.7244200596785061\n", "Epoch 15 - loss : 0.5053795011737678\n", "0.4989502429962158\n", "On test : precision = 0.7325291977790542, recall = 0.7365482722109924\n", "0.5064039799836603\n", "On test : precision = 0.7364736765349278, recall = 0.729136586774473\n", "0.505478177497636\n", "On test : precision = 0.7345682005527808, recall = 0.7290884589469632\n", "0.5045830322262457\n", "On test : precision = 0.7247833943381365, recall = 0.7528636057368371\n", "Epoch 16 - loss : 0.5043610495102556\n", "0.4920410215854645\n", "On test : precision = 0.7243070658460969, recall = 0.7533448840119357\n", "0.5003653608336307\n", "On test : precision = 0.7195271096616388, recall = 0.7645105399942247\n", "0.5010813141047065\n", "On test : precision = 0.7379865508270751, recall = 0.7236018866108384\n", "0.5018343973991483\n", "On test : precision = 0.7353953556018811, recall = 0.7300510154971604\n", "Epoch 17 - loss : 0.5035921111137052\n", "0.4888218641281128\n", "On test : precision = 0.7268319970165952, recall = 0.7504090865338339\n", "0.5015489895745079\n", "On test : precision = 0.7332950136513867, recall = 0.7367889113485417\n", "0.5033437172275278\n", "On test : precision = 0.7282874905802562, recall = 0.7442005967850611\n", "0.5026719430554348\n", "On test : precision = 0.7377786468517794, recall = 0.7263451727789008\n", "Epoch 18 - loss : 0.5028010086168216\n", "0.5066063404083252\n", "On test : precision = 0.7345524542829643, recall = 0.7346231591105977\n", "0.5018388912229255\n", "On test : precision = 0.719209142027934, recall = 0.763307344306478\n", "0.5002150563754846\n", "On test : precision = 0.7373129950132004, recall = 0.7258157666762922\n", "0.5022425656500845\n", "On test : precision = 0.7278510838831291, recall = 0.7433342958898835\n", "Epoch 19 - loss : 0.5019486363175549\n", "0.47926339507102966\n", "On test : precision = 0.7323501427212179, recall = 0.7408797766868803\n", "0.49971008772897246\n", "On test : 
precision = 0.7300813777441333, recall = 0.7426605063047454\n", "0.5002152393409862\n", "On test : precision = 0.7273622970817303, recall = 0.7461257098854558\n", "0.5013288241684238\n", "On test : precision = 0.7416225313072315, recall = 0.7210992395803253\n", "Epoch 20 - loss : 0.5008858129193511\n", "0.5151246786117554\n", "On test : precision = 0.735239852398524, recall = 0.728799691981904\n", "0.4983244101599892\n", "On test : precision = 0.7248408818374689, recall = 0.7563769371450573\n", "0.4970505050758817\n", "On test : precision = 0.730860225229488, recall = 0.7433824237173934\n", "0.49717872661609586\n", "On test : precision = 0.7449337881219904, recall = 0.714746366349023\n", "Epoch 21 - loss : 0.49780065651181377\n", "0.48739099502563477\n", "On test : precision = 0.7354018115243784, recall = 0.7346231591105977\n", "0.49334852795789735\n", "On test : precision = 0.7303307264675398, recall = 0.7460775820579459\n", "0.49463323218312427\n", "On test : precision = 0.7429808841099164, recall = 0.7183078255847531\n", "0.49579841146041387\n", "On test : precision = 0.7449392712550608, recall = 0.7084416209452306\n", "Epoch 22 - loss : 0.4965084141568293\n", "0.4926077127456665\n", "On test : precision = 0.7217624590760277, recall = 0.7639330060641063\n", "0.49502673007474085\n", "On test : precision = 0.7273743537193162, recall = 0.7515641543940706\n", "0.4956098737111732\n", "On test : precision = 0.7344678811121764, recall = 0.7373664452786601\n", "0.4958594325570965\n", "On test : precision = 0.7506115014311736, recall = 0.6941476561748002\n", "Epoch 23 - loss : 0.495923723374741\n", "0.47399887442588806\n", "On test : precision = 0.7381651017214398, recall = 0.7264414284339205\n", "0.4940861616984452\n", "On test : precision = 0.7452663468821005, recall = 0.7103667340456252\n", "0.49426327445613805\n", "On test : precision = 0.7333174224343676, recall = 0.7393878140340745\n", "0.4953075092892314\n", "On test : precision = 0.7655076495132128, recall = 
0.662238906535759\n", "Epoch 24 - loss : 0.4953661680221558\n", "0.5228779315948486\n", "On test : precision = 0.7442258340461934, recall = 0.7118105688709212\n", "0.4923725653402876\n", "On test : precision = 0.7376028030561098, recall = 0.7294734815670421\n", "0.4930079595663061\n", "On test : precision = 0.7340628882729618, recall = 0.7392915583790548\n", "0.49445695774103715\n", "On test : precision = 0.7341409901745684, recall = 0.7407835210318606\n", "Epoch 25 - loss : 0.49466084581387193\n", "0.46513304114341736\n", "On test : precision = 0.7360903344110409, recall = 0.7341418808354991\n", "0.49388582281547017\n", "On test : precision = 0.7195767195767195, recall = 0.765809991336991\n", "0.49362160183897064\n", "On test : precision = 0.7461990675045611, recall = 0.7086341322552699\n", "0.49338837715478434\n", "On test : precision = 0.7481032639136412, recall = 0.7070940417749543\n", "Epoch 26 - loss : 0.49349285349061217\n", "0.4880646765232086\n", "On test : precision = 0.7410621825533803, recall = 0.7232649918182693\n", "0.49176528194163105\n", "On test : precision = 0.7345693837360103, recall = 0.7360188661083839\n", "0.49261100775566863\n", "On test : precision = 0.7456571944974395, recall = 0.7147944941765328\n", "0.49288865616947314\n", "On test : precision = 0.7505957931820537, recall = 0.6972759649629415\n", "Epoch 27 - loss : 0.4931577115873747\n", "0.47572997212409973\n", "On test : precision = 0.7506438652518801, recall = 0.7013668303012802\n", "0.49050306595198\n", "On test : precision = 0.7380428941325907, recall = 0.7270670901915488\n", "0.4920397731498699\n", "On test : precision = 0.7261525681849647, recall = 0.7572913658677447\n", "0.492620616260161\n", "On test : precision = 0.7270705947748749, recall = 0.75541438059486\n", "Epoch 28 - loss : 0.4932937731471243\n", "0.5012404918670654\n", "On test : precision = 0.7346705417123255, recall = 0.7421311002021369\n", "0.4935308385013354\n", "On test : precision = 0.7320115733055068, recall = 
0.7427567619597651\n", "0.4925164138499777\n", "On test : precision = 0.7216824049997719, recall = 0.7613822312060834\n", "0.4938466421195439\n", "On test : precision = 0.750325436084353, recall = 0.693521994417172\n", "Epoch 29 - loss : 0.4946022235894505\n", "0.4953339099884033\n", "On test : precision = 0.7297094305966296, recall = 0.7481470786408702\n", "0.49041805763055785\n", "On test : precision = 0.7502446561936646, recall = 0.7010299355087112\n", "0.4931423207420615\n", "On test : precision = 0.7352813956841447, recall = 0.736307633073443\n", "0.4925843425763406\n", "On test : precision = 0.7460094368035338, recall = 0.7152757724516315\n", "Epoch 30 - loss : 0.4933103129833559\n", "0.5755341053009033\n", "On test : precision = 0.7485953621411788, recall = 0.7053614399845991\n", "0.4932693898087681\n", "On test : precision = 0.7366245136186771, recall = 0.7288959476369237\n", "0.49100364855865936\n", "On test : precision = 0.7430106457068948, recall = 0.7188372316873616\n", "0.49268141141365535\n", "On test : precision = 0.7314221048660883, recall = 0.7465588603330445\n", "Epoch 31 - loss : 0.49313025632991064\n", "0.46600431203842163\n", "On test : precision = 0.7224534601838722, recall = 0.7601790355183367\n", "0.4903270124208809\n", "On test : precision = 0.7441521203279843, recall = 0.7119549523534507\n", "0.49055656909349543\n", "On test : precision = 0.7316389548693587, recall = 0.7412166714794494\n", "0.492653287625392\n", "On test : precision = 0.7404246971703202, recall = 0.72663393974396\n", "Epoch 32 - loss : 0.4919591864453086\n", "0.4467436969280243\n", "On test : precision = 0.7414435618319114, recall = 0.7277408797766869\n", "0.4866951137486071\n", "On test : precision = 0.7230283334096215, recall = 0.7602271633458466\n", "0.4884573491058539\n", "On test : precision = 0.735220368405825, recall = 0.73380498604293\n", "0.49062324639570676\n", "On test : precision = 0.7195598472939592, recall = 0.7710559245355665\n", "Epoch 33 - loss : 
0.49098759293556216\n", "0.4772811532020569\n", "On test : precision = 0.7395291809710642, recall = 0.7257195110212725\n", "0.4887826596156205\n", "On test : precision = 0.7256522142790828, recall = 0.7523341996342285\n", "0.49171305325493886\n", "On test : precision = 0.728898459809934, recall = 0.7493502743286168\n", "0.49049466204801667\n", "On test : precision = 0.7414505450744189, recall = 0.7168639907594572\n", "Epoch 34 - loss : 0.49134464648705495\n", "0.48381781578063965\n", "On test : precision = 0.7229911835914302, recall = 0.7617191259986524\n", "0.49350475320721615\n", "On test : precision = 0.7532263963634463, recall = 0.6938107613822312\n", "0.4918556806459949\n", "On test : precision = 0.7408834865105247, recall = 0.7216286456829338\n", "0.4905527135066416\n", "On test : precision = 0.735855421686747, recall = 0.7348637982481471\n", "Epoch 35 - loss : 0.49083545019355\n", "0.50368332862854\n", "On test : precision = 0.7351413733410271, recall = 0.7357782269708345\n", "0.490445864082563\n", "On test : precision = 0.7325553703262682, recall = 0.7402059871017422\n", "0.48895341009642945\n", "On test : precision = 0.742154915590864, recall = 0.7193666377899701\n", "0.4895016640127695\n", "On test : precision = 0.7154373775164796, recall = 0.7730772932909808\n", "Epoch 36 - loss : 0.49018767596800117\n", "0.4943732023239136\n", "On test : precision = 0.741195092995647, recall = 0.7211473674078352\n", "0.48728429179380434\n", "On test : precision = 0.7321445599313435, recall = 0.7390509192415055\n", "0.4889129802065702\n", "On test : precision = 0.7385675200312684, recall = 0.7275483684666474\n", "0.4895449759952254\n", "On test : precision = 0.7478923311325546, recall = 0.7087303879102897\n", "Epoch 37 - loss : 0.4894779199286352\n", "0.5134391784667969\n", "On test : precision = 0.7131474103585658, recall = 0.7753393011839446\n", "0.48750126656919424\n", "On test : precision = 0.7298979687353739, recall = 0.7505534700163634\n", "0.4876820467301269\n", 
"On test : precision = 0.7359180335754486, recall = 0.736307633073443\n", "0.48870848401440337\n", "On test : precision = 0.7342398616182971, recall = 0.7354413321782655\n", "Epoch 38 - loss : 0.4888177595561064\n", "0.5162264704704285\n", "On test : precision = 0.7294762998734118, recall = 0.7488208682260082\n", "0.4837004595463819\n", "On test : precision = 0.7297081355454247, recall = 0.7472326499181827\n", "0.4879101812839508\n", "On test : precision = 0.7376440711076382, recall = 0.7269227067090192\n", "0.4885243367515133\n", "On test : precision = 0.7336136059621632, recall = 0.7390509192415055\n", "Epoch 39 - loss : 0.48852996003778676\n", "0.4767839014530182\n", "On test : precision = 0.7173160756819918, recall = 0.7681682548849745\n", "0.48714618989736724\n", "On test : precision = 0.7371397922863133, recall = 0.7275964962941572\n", "0.48659720883440616\n", "On test : precision = 0.7303243472923847, recall = 0.7444893637501203\n", "0.4855928495278786\n", "On test : precision = 0.7257344737820751, recall = 0.7513716430840311\n", "Epoch 40 - loss : 0.4853414050385922\n", "0.4991248846054077\n", "On test : precision = 0.7392603477834925, recall = 0.7263451727789008\n", "0.4837396457643792\n", "On test : precision = 0.7260178471834914, recall = 0.75180479353162\n", "0.4851182410076483\n", "On test : precision = 0.7457893668831169, recall = 0.7075271922225431\n", "0.4848854331875164\n", "On test : precision = 0.7241426737575647, recall = 0.7601790355183367\n", "Epoch 41 - loss : 0.48461107302315626\n", "0.4812166690826416\n", "On test : precision = 0.7516588652847076, recall = 0.7032919434016749\n", "0.4824957245647317\n", "On test : precision = 0.7348794687966126, recall = 0.7350563095581866\n", "0.48231442339384734\n", "On test : precision = 0.7364314789687924, recall = 0.7313985946674367\n", "0.48301191365600027\n", "On test : precision = 0.7260126936718313, recall = 0.7487246125709885\n", "Epoch 42 - loss : 0.48370890670184846\n", "0.4642280340194702\n", 
"On test : precision = 0.7284904688304997, recall = 0.7485802290884589\n", "0.4810808053111086\n", "On test : precision = 0.7414962618210625, recall = 0.7207623447877562\n", "0.4822801913491529\n", "On test : precision = 0.7300451807228916, recall = 0.7465588603330445\n", "0.4838001663700687\n", "On test : precision = 0.7426664674535585, recall = 0.7176821638271248\n", "Epoch 43 - loss : 0.4833500043500828\n", "0.48344671726226807\n", "On test : precision = 0.7282776710852967, recall = 0.7482914621233997\n", "0.47828448113828603\n", "On test : precision = 0.7443261699136373, recall = 0.7134469150062566\n", "0.47993328796690377\n", "On test : precision = 0.7382389011528084, recall = 0.7242756761959765\n", "0.482173868984083\n", "On test : precision = 0.7241951849460382, recall = 0.7557031475599192\n", "Epoch 44 - loss : 0.48240686473967154\n", "0.4691382050514221\n", "On test : precision = 0.7398253662868137, recall = 0.7217730291654635\n", "0.4799038248487038\n", "On test : precision = 0.7180549302116164, recall = 0.7675425931273462\n", "0.48127377181503905\n", "On test : precision = 0.7440377566902646, recall = 0.7132062758687073\n", "0.48116818308038173\n", "On test : precision = 0.7240358827337464, recall = 0.753585523149485\n", "Epoch 45 - loss : 0.4820579810233056\n", "0.4957138001918793\n", "On test : precision = 0.7318316030098104, recall = 0.739580325344114\n", "0.4815530251748491\n", "On test : precision = 0.7397954165437198, recall = 0.7239869092309174\n", "0.4814220647610242\n", "On test : precision = 0.7348794687966126, recall = 0.7350563095581866\n", "0.4813806482921803\n", "On test : precision = 0.7248872111223644, recall = 0.7578207719703532\n", "Epoch 46 - loss : 0.48105589106113095\n", "0.4731432795524597\n", "On test : precision = 0.7445958763915931, recall = 0.7178265473096545\n", "0.47872081457978427\n", "On test : precision = 0.7366201279813845, recall = 0.7313023390124169\n", "0.4795397832915558\n", "On test : precision = 0.72129574633344, 
recall = 0.7597940128982578\n", "0.4801403841505019\n", "On test : precision = 0.7380836100800938, recall = 0.727355857156608\n", "Epoch 47 - loss : 0.48055834536310993\n", "0.47339731454849243\n", "On test : precision = 0.7346831070383342, recall = 0.7369814226585812\n", "0.4796473847167327\n", "On test : precision = 0.738722447583207, recall = 0.7274521128116277\n", "0.4792541035668767\n", "On test : precision = 0.735829812035553, recall = 0.729136586774473\n", "0.4792395104403512\n", "On test : precision = 0.7345323741007195, recall = 0.7370776783136009\n", "Epoch 48 - loss : 0.479706547079207\n", "0.45964089035987854\n", "On test : precision = 0.7396514161220044, recall = 0.7189334873423814\n", "0.4777260641060253\n", "On test : precision = 0.7351035016444186, recall = 0.7314948503224564\n", "0.4780597803901084\n", "On test : precision = 0.7253739637845598, recall = 0.7538261622870344\n", "0.4789733001560072\n", "On test : precision = 0.7376690339527191, recall = 0.729858504187121\n", "Epoch 49 - loss : 0.4793024491660203\n" ] } ], "source": [ "n_epochs = 50\n", "n_epoch_print = 1\n", "n_batch_print = 100\n", "\n", "for i in range(n_epochs):\n", "\n", " losses = []\n", "\n", " j = 0\n", " for indices in data_loader:\n", " X_tensor = X_train_preprocess[indices,:].to(device)\n", " X_train_tokens_indices = list(operator.itemgetter(*indices)(X_train_tokens))\n", " X_train_tokens_indices = pad_sequence(X_train_tokens_indices, batch_first=True, padding_value=biobert_tokenizer(\"[PAD]\")[\"input_ids\"][1]).to(device)\n", " #X_train_drug_tokens_indices = list(operator.itemgetter(*indices)(X_train_tokens_drug))\n", " #X_train_drug_tokens_indices = pad_sequence(X_train_drug_tokens_indices, batch_first=True, padding_value=len(drug_encoder.categories_[0])).to(device)\n", " #X_train_drug_tokens_indices = X_train_drug_tokens_indices.int()\n", "\n", " y_tensor = y_train_preprocess[indices,5].unsqueeze(1).to(device)\n", " #y_tensor = 
y_train_preprocess[indices,:].to(device)\n", "\n", " loss = network.fit((X_tensor, X_train_tokens_indices), y_tensor).detach().cpu().item()\n", "\n", " losses.append(loss)\n", "\n", " if j%n_batch_print == 0:\n", " print(np.array(losses).mean())\n", "\n", " y_test_hat, y_true = get_test_evaluation(X_test_preprocess, X_test_tokens)\n", " prec = precision_score(y_true, y_test_hat, zero_division=0)\n", " rec = recall_score(y_true, y_test_hat, zero_division=0)\n", " \n", " print(f\"On test : precision = {prec}, recall = {rec}\")\n", "\n", " j += 1\n", "\n", " if (i%n_epoch_print) == 0:\n", " mean_loss = np.array(losses).mean()\n", " print(f\"Epoch {i} - loss : {mean_loss}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# False positive analysis" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "y_hats = []\n", "y_val = []\n", "\n", "data_loader_noshuffle = DataLoader(range(X_train_preprocess.shape[0]), shuffle=False, batch_size=1024)\n", "\n", "for indices in data_loader_noshuffle:\n", " X_tensor = X_train_preprocess[indices,:].to(device)\n", " X_train_tokens_indices = list(operator.itemgetter(*indices)(X_train_tokens))\n", " X_train_tokens_indices = pad_sequence(X_train_tokens_indices, batch_first=True, padding_value=biobert_tokenizer(\"[PAD]\")[\"input_ids\"][1]).to(device)\n", "\n", " y_hats.append(\n", " network.predict((X_tensor, X_train_tokens_indices)).detach().cpu()\n", " )\n", " y_val.append(y_train_preprocess[indices,5])\n", "\n", "y_hat = np.concatenate(y_hats)\n", "y_val = np.concatenate(y_val)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fp = (y_hat >= 0.65)*1 & ((y_val == 0)*1).reshape(-1, 1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "416252 UNRESPPONSIVE\n", "441947 Abd pain, Abnormal CT, Transfer\n", "189529 Chest pain, Palpitations, ILI\n", "240265 Weakness, 
Transfer\n", "291786 Hallucinations\n", " ... \n", "252801 s/p Fall\n", "239629 Abd pain\n", "278167 Hyperglycemia, Weakness\n", "137337 Wound eval\n", "131932 s/p Fall, R Hip pain\n", "Name: chiefcomplaint, Length: 24690, dtype: string" ] }, "execution_count": 206, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X_train.iloc[np.where(fp == 1)[0],:][\"chiefcomplaint\"]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "from sklearn.metrics import roc_curve, roc_auc_score\n", "\n", "fpr, tpr, thresholds = roc_curve(\n", " y_val,\n", " y_hat\n", ")\n", "\n", "auc = roc_auc_score(y_val, y_hat)\n", "\n", "plt.figure(figsize=(10,10))\n", "\n", "plt.plot(fpr,tpr,label=\"AUC=\"+str(auc))\n", "plt.ylabel('True Positive Rate')\n", "plt.xlabel('False Positive Rate')\n", "plt.legend(loc=4)\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "batch_size = 256\n", "\n", "def get_test_evaluation(X_test_preprocess, X_test_tokens):\n", " y_hats = []\n", "\n", " for idx in range(0, X_test_preprocess.shape[0], batch_size):\n", " X_test_tensor = X_test_preprocess[idx:idx+batch_size].to(device)\n", " X_test_tokens_indices = X_test_tokens[idx:idx+batch_size]\n", " X_test_tokens_indices = pad_sequence(X_test_tokens_indices, batch_first=True, padding_value=biobert_tokenizer(\"[PAD]\")[\"input_ids\"][1]).to(device)\n", " #X_test_drug_tokens_indices = X_test_tokens_drug[idx:idx+batch_size]\n", " #X_test_drug_tokens_indices = pad_sequence(X_test_drug_tokens_indices, batch_first=True, padding_value=len(drug_encoder.categories_[0])).to(device)\n", " #X_test_drug_tokens_indices = X_test_drug_tokens_indices.int()\n", "\n", " y_hat_ = ((network.predict((X_test_tensor, X_test_tokens_indices)).detach().cpu()) >= 0.65)*1\n", "\n", " y_hats.append(y_hat_)\n", "\n", " y_hat = torch.concat(y_hats, axis=0).numpy()\n", " y_true = y_test_preprocess[:,[5]]\n", " #y_true = y_test_preprocess[:,:]\n", "\n", " return y_hat, y_true\n", "\n", "y_hat, y_true = get_test_evaluation(X_test_preprocess, X_test_tokens)\n", "titles = y_train.columns.tolist()[1:]\n", "titles = y_train.columns[[6]].tolist()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "n_cols = 3\n", "n_rows = len(titles) // n_cols + int((len(titles)%n_cols) != 0)\n", "\n", "figs, axs = plt.subplots(nrows=n_rows, ncols=n_cols, figsize=(30,30))\n", "axs_flatten = axs.flatten()\n", "\n", "for i in range(len(titles)):\n", " cm_plot = ConfusionMatrixDisplay(confusion_matrix(y_true[:,i], y_hat[:,i]))\n", " cm_plot.plot(ax=axs_flatten[i])\n", " axs_flatten[i].set_title(titles[i])\n", "\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n", "  <thead>\n", "    <tr><th></th><th>precision</th><th>recall</th><th>f1_score</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>IonoC</th><td>0.8</td><td>0.58</td><td>0.67</td></tr>\n", "  </tbody>\n", "</table>
" ], "text/plain": [ " precision recall f1_score\n", "IonoC 0.8 0.58 0.67" ] }, "execution_count": 210, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.DataFrame.from_dict(\n", " dict(zip(titles, np.concatenate([\n", " precision_score(y_true, y_hat, average=\"binary\", zero_division=0).reshape(1,-1),\n", " recall_score(y_true, y_hat, average=\"binary\", zero_division=0).reshape(1,-1),\n", " f1_score(y_true, y_hat, average=\"binary\").reshape(1,-1)\n", " ], axis=0).T)),\n", " orient=\"index\",\n", " columns=[\"precision\",\"recall\",\"f1_score\"]\n", ").round(2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "interpreter": { "hash": "28b293e0c0671e44c7281dde6399c7c7419d3faca031d22494da8635907ada72" }, "kernelspec": { "display_name": "Python 3.9.7 ('base')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }