{ "cells": [ { "cell_type": "markdown", "id": "e4f267ab", "metadata": {}, "source": [ "# Laboratory Session 1: Machine Learning Landscape\n", "\n", "## Introduction\n", "\n", "In this session (Machine Learning Landscape), the student will explore the Jupyter environment for data analysis and data-science-oriented programming.\n", "\n", "This session covers the basics of the `Numpy`, `Pandas`, `Matplotlib` and `Sklearn` modules by developing **a linear model that estimates life satisfaction from GDP per capita**. The model is trained on life satisfaction data collected by the OECD and GDP per capita data from the IMF. The parameters obtained after training are then used to compare, by visual inspection, the model's output against the real data. The objectives for this session are as follows.\n", "\n", "General objective: *Develop a linear regression model using the `fit` method of the Sklearn module to estimate life satisfaction as a function of each country's GDP per capita.*\n", "\n", "Specific objectives:\n", "\n", "- Learn how to start and operate the Jupyter platform, including the commands for inserting, copying, deleting and evaluating cells\n", "- Identify the procedure for installing new Python modules with the **pip3** command\n", "- Learn how to read CSV files\n", "- Become familiar with accessing data by column name and with manipulating, editing and filtering data in Pandas structures\n", "- Write a Python function that reads the OECD and GDP datasets and builds a new dataset containing only \"Life Satisfaction\" and \"GDP per capita\"\n", "- Train a linear regression model\n", "- Compare the model with the real data and predict values for new instances" ] },
{ "cell_type": "markdown", "id": "53912466", "metadata": {}, "source": [ "# The Data: Life Satisfaction and GDP per capita\n", "The next sections explore the two datasets, one from the Organisation for Economic Co-operation and Development (OECD) and one from the International Monetary Fund (IMF).\n", "\n", "## Life satisfaction data description\n", "\n", "This dataset was obtained from the OECD's website at: http://stats.oecd.org/index.aspx?DataSetCode=BLI\n", "\n", "```\n", "Int64Index: 3292 entries, 0 to 3291\n", "Data columns (total 17 columns):\n", "\"LOCATION\" 3292 non-null object\n", "Country 3292 non-null object\n", "INDICATOR 3292 non-null object\n", "Indicator 3292 non-null object\n", "MEASURE 3292 non-null object\n", "Measure 3292 non-null object\n", "INEQUALITY 3292 non-null object\n", "Inequality 3292 non-null object\n", "Unit Code 3292 non-null object\n", "Unit 3292 non-null object\n", "PowerCode Code 3292 non-null int64\n", "PowerCode 3292 non-null object\n", "Reference Period Code 0 non-null float64\n", "Reference Period 0 non-null float64\n", "Value 3292 non-null float64\n", "Flag Codes 1120 non-null object\n", "Flags 1120 non-null object\n", "dtypes: float64(3), int64(1), object(13)\n", "memory usage: 462.9+ KB\n", "```\n" ] },
{ "cell_type": "markdown", "id": "72915229", "metadata": {}, "source": [ "### Example using Python Pandas\n", "\n", "```\n", ">>> life_sat = pd.read_csv(\"oecd_bli_2015.csv\", thousands=',')\n", "\n", ">>> life_sat_total = life_sat[life_sat[\"INEQUALITY\"]==\"TOT\"]\n", "\n", ">>> life_sat_total = life_sat_total.pivot(index=\"Country\", columns=\"Indicator\", values=\"Value\")\n", "\n", ">>> life_sat_total.info()\n", "\n", "Index: 37 entries, Australia to United States\n", "Data columns (total 24 columns):\n", "Air pollution 37 non-null float64\n", "Assault rate 37 non-null float64\n", "Consultation on rule-making 37 non-null float64\n", "Dwellings without basic facilities 37 non-null float64\n", "Educational attainment 37 non-null 
float64\n", "Employees working very long hours 37 non-null float64\n", "Employment rate 37 non-null float64\n", "Homicide rate 37 non-null float64\n", "Household net adjusted disposable income 37 non-null float64\n", "Household net financial wealth 37 non-null float64\n", "Housing expenditure 37 non-null float64\n", "Job security 37 non-null float64\n", "Life expectancy 37 non-null float64\n", "Life satisfaction 37 non-null float64\n", "Long-term unemployment rate 37 non-null float64\n", "Personal earnings 37 non-null float64\n", "Quality of support network 37 non-null float64\n", "Rooms per person 37 non-null float64\n", "Self-reported health 37 non-null float64\n", "Student skills 37 non-null float64\n", "Time devoted to leisure and personal care 37 non-null float64\n", "Voter turnout 37 non-null float64\n", "Water quality 37 non-null float64\n", "Years in education 37 non-null float64\n", "dtypes: float64(24)\n", "memory usage: 7.2+ KB\n", "```" ] }, { "cell_type": "code", "execution_count": 3, "id": "86e530da", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "RangeIndex: 3292 entries, 0 to 3291\n", "Data columns (total 17 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 LOCATION 3292 non-null object \n", " 1 Country 3292 non-null object \n", " 2 INDICATOR 3292 non-null object \n", " 3 Indicator 3292 non-null object \n", " 4 MEASURE 3292 non-null object \n", " 5 Measure 3292 non-null object \n", " 6 INEQUALITY 3292 non-null object \n", " 7 Inequality 3292 non-null object \n", " 8 Unit Code 3292 non-null object \n", " 9 Unit 3292 non-null object \n", " 10 PowerCode Code 3292 non-null int64 \n", " 11 PowerCode 3292 non-null object \n", " 12 Reference Period Code 0 non-null float64\n", " 13 Reference Period 0 non-null float64\n", " 14 Value 3292 non-null float64\n", " 15 Flag Codes 1120 non-null object \n", " 16 Flags 1120 non-null object \n", "dtypes: float64(3), int64(1), 
object(13)\n", "memory usage: 437.3+ KB\n" ] } ], "source": [ "# Load the data\n", "import numpy as np\n", "import pandas as pd\n", "url = \"https://raw.githubusercontent.com/machine-learning-course-uac/1-ml-landscape/main/oecd_bli_2015.csv\"\n", "oecd_bli = pd.read_csv(url, thousands=',')\n", "oecd_bli.info()" ] }, { "cell_type": "code", "execution_count": 4, "id": "7a58ef40", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<p>3292 rows × 17 columns</p>\n", "</div>
" ], "text/plain": [ " LOCATION Country INDICATOR \\\n", "0 AUS Australia HO_BASE \n", "1 AUT Austria HO_BASE \n", "2 BEL Belgium HO_BASE \n", "3 CAN Canada HO_BASE \n", "4 CZE Czech Republic HO_BASE \n", "... ... ... ... \n", "3287 EST Estonia WL_TNOW \n", "3288 ISR Israel WL_TNOW \n", "3289 RUS Russia WL_TNOW \n", "3290 SVN Slovenia WL_TNOW \n", "3291 OECD OECD - Total WL_TNOW \n", "\n", " Indicator MEASURE Measure INEQUALITY \\\n", "0 Dwellings without basic facilities L Value TOT \n", "1 Dwellings without basic facilities L Value TOT \n", "2 Dwellings without basic facilities L Value TOT \n", "3 Dwellings without basic facilities L Value TOT \n", "4 Dwellings without basic facilities L Value TOT \n", "... ... ... ... ... \n", "3287 Time devoted to leisure and personal care L Value WMN \n", "3288 Time devoted to leisure and personal care L Value WMN \n", "3289 Time devoted to leisure and personal care L Value WMN \n", "3290 Time devoted to leisure and personal care L Value WMN \n", "3291 Time devoted to leisure and personal care L Value WMN \n", "\n", " Inequality Unit Code Unit PowerCode Code PowerCode \\\n", "0 Total PC Percentage 0 units \n", "1 Total PC Percentage 0 units \n", "2 Total PC Percentage 0 units \n", "3 Total PC Percentage 0 units \n", "4 Total PC Percentage 0 units \n", "... ... ... ... ... ... \n", "3287 Women HOUR Hours 0 units \n", "3288 Women HOUR Hours 0 units \n", "3289 Women HOUR Hours 0 units \n", "3290 Women HOUR Hours 0 units \n", "3291 Women HOUR Hours 0 units \n", "\n", " Reference Period Code Reference Period Value Flag Codes \\\n", "0 NaN NaN 1.10 E \n", "1 NaN NaN 1.00 NaN \n", "2 NaN NaN 2.00 NaN \n", "3 NaN NaN 0.20 NaN \n", "4 NaN NaN 0.90 NaN \n", "... ... ... ... ... \n", "3287 NaN NaN 14.43 NaN \n", "3288 NaN NaN 14.24 E \n", "3289 NaN NaN 14.75 E \n", "3290 NaN NaN 14.12 NaN \n", "3291 NaN NaN 14.74 NaN \n", "\n", " Flags \n", "0 Estimated value \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "... ... 
\n", "3287 NaN \n", "3288 Estimated value \n", "3289 Estimated value \n", "3290 NaN \n", "3291 NaN \n", "\n", "[3292 rows x 17 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "oecd_bli" ] }, { "cell_type": "code", "execution_count": 10, "id": "d1c803f7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "Int64Index: 888 entries, 0 to 3217\n", "Data columns (total 17 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 LOCATION 888 non-null object \n", " 1 Country 888 non-null object \n", " 2 INDICATOR 888 non-null object \n", " 3 Indicator 888 non-null object \n", " 4 MEASURE 888 non-null object \n", " 5 Measure 888 non-null object \n", " 6 INEQUALITY 888 non-null object \n", " 7 Inequality 888 non-null object \n", " 8 Unit Code 888 non-null object \n", " 9 Unit 888 non-null object \n", " 10 PowerCode Code 888 non-null int64 \n", " 11 PowerCode 888 non-null object \n", " 12 Reference Period Code 0 non-null float64\n", " 13 Reference Period 0 non-null float64\n", " 14 Value 888 non-null float64\n", " 15 Flag Codes 58 non-null object \n", " 16 Flags 58 non-null object \n", "dtypes: float64(3), int64(1), object(13)\n", "memory usage: 124.9+ KB\n" ] } ], "source": [ "life_sat_total = oecd_bli[oecd_bli[\"INEQUALITY\"]==\"TOT\"]\n", "life_sat_total.info()" ] }, { "cell_type": "code", "execution_count": 11, "id": "426e54c9", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "Int64Index: 24 entries, 1 to 3182\n", "Data columns (total 17 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 LOCATION 24 non-null object \n", " 1 Country 24 non-null object \n", " 2 INDICATOR 24 non-null object \n", " 3 Indicator 24 non-null object \n", " 4 MEASURE 24 non-null object \n", " 5 Measure 24 non-null object \n", " 6 INEQUALITY 24 non-null object \n", " 7 Inequality 24 
non-null object \n", " 8 Unit Code 24 non-null object \n", " 9 Unit 24 non-null object \n", " 10 PowerCode Code 24 non-null int64 \n", " 11 PowerCode 24 non-null object \n", " 12 Reference Period Code 0 non-null float64\n", " 13 Reference Period 0 non-null float64\n", " 14 Value 24 non-null float64\n", " 15 Flag Codes 0 non-null object \n", " 16 Flags 0 non-null object \n", "dtypes: float64(3), int64(1), object(13)\n", "memory usage: 3.4+ KB\n" ] } ], "source": [ "new = life_sat_total[life_sat_total[\"Country\"]=='Austria']\n", "new.info()" ] }, { "cell_type": "code", "execution_count": 15, "id": "20e499fb", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<p>37 rows × 24 columns</p>\n", "</div>
" ], "text/plain": [ "Indicator Air pollution Assault rate Consultation on rule-making \\\n", "Country \n", "Australia 13.0 2.1 10.5 \n", "Austria 27.0 3.4 7.1 \n", "Belgium 21.0 6.6 4.5 \n", "Brazil 18.0 7.9 4.0 \n", "Canada 15.0 1.3 10.5 \n", "Chile 46.0 6.9 2.0 \n", "Czech Republic 16.0 2.8 6.8 \n", "Denmark 15.0 3.9 7.0 \n", "Estonia 9.0 5.5 3.3 \n", "Finland 15.0 2.4 9.0 \n", "France 12.0 5.0 3.5 \n", "Germany 16.0 3.6 4.5 \n", "Greece 27.0 3.7 6.5 \n", "Hungary 15.0 3.6 7.9 \n", "Iceland 18.0 2.7 5.1 \n", "Ireland 13.0 2.6 9.0 \n", "Israel 21.0 6.4 2.5 \n", "Italy 21.0 4.7 5.0 \n", "Japan 24.0 1.4 7.3 \n", "Korea 30.0 2.1 10.4 \n", "Luxembourg 12.0 4.3 6.0 \n", "Mexico 30.0 12.8 9.0 \n", "Netherlands 30.0 4.9 6.1 \n", "New Zealand 11.0 2.2 10.3 \n", "Norway 16.0 3.3 8.1 \n", "OECD - Total 20.0 3.9 7.3 \n", "Poland 33.0 1.4 10.8 \n", "Portugal 18.0 5.7 6.5 \n", "Russia 15.0 3.8 2.5 \n", "Slovak Republic 13.0 3.0 6.6 \n", "Slovenia 26.0 3.9 10.3 \n", "Spain 24.0 4.2 7.3 \n", "Sweden 10.0 5.1 10.9 \n", "Switzerland 20.0 4.2 8.4 \n", "Turkey 35.0 5.0 5.5 \n", "United Kingdom 13.0 1.9 11.5 \n", "United States 18.0 1.5 8.3 \n", "\n", "Indicator Dwellings without basic facilities Educational attainment \\\n", "Country \n", "Australia 1.1 76.0 \n", "Austria 1.0 83.0 \n", "Belgium 2.0 72.0 \n", "Brazil 6.7 45.0 \n", "Canada 0.2 89.0 \n", "Chile 9.4 57.0 \n", "Czech Republic 0.9 92.0 \n", "Denmark 0.9 78.0 \n", "Estonia 8.1 90.0 \n", "Finland 0.6 85.0 \n", "France 0.5 73.0 \n", "Germany 0.1 86.0 \n", "Greece 0.7 68.0 \n", "Hungary 4.8 82.0 \n", "Iceland 0.4 71.0 \n", "Ireland 0.2 75.0 \n", "Israel 3.7 85.0 \n", "Italy 1.1 57.0 \n", "Japan 6.4 94.0 \n", "Korea 4.2 82.0 \n", "Luxembourg 0.1 78.0 \n", "Mexico 4.2 37.0 \n", "Netherlands 0.0 73.0 \n", "New Zealand 0.2 74.0 \n", "Norway 0.3 82.0 \n", "OECD - Total 2.4 75.0 \n", "Poland 3.2 90.0 \n", "Portugal 0.9 38.0 \n", "Russia 15.1 94.0 \n", "Slovak Republic 0.6 92.0 \n", "Slovenia 0.5 85.0 \n", "Spain 0.1 55.0 \n", 
"Sweden 0.0 88.0 \n", "Switzerland 0.0 86.0 \n", "Turkey 12.7 34.0 \n", "United Kingdom 0.2 78.0 \n", "United States 0.1 89.0 \n", "\n", "Indicator Employees working very long hours Employment rate \\\n", "Country \n", "Australia 14.02 72.0 \n", "Austria 7.61 72.0 \n", "Belgium 4.57 62.0 \n", "Brazil 10.41 67.0 \n", "Canada 3.94 72.0 \n", "Chile 15.42 62.0 \n", "Czech Republic 6.98 68.0 \n", "Denmark 2.03 73.0 \n", "Estonia 3.30 68.0 \n", "Finland 3.58 69.0 \n", "France 8.15 64.0 \n", "Germany 5.25 73.0 \n", "Greece 6.16 49.0 \n", "Hungary 3.19 58.0 \n", "Iceland 12.25 82.0 \n", "Ireland 4.20 60.0 \n", "Israel 16.03 67.0 \n", "Italy 3.66 56.0 \n", "Japan 22.26 72.0 \n", "Korea 18.72 64.0 \n", "Luxembourg 3.47 66.0 \n", "Mexico 28.83 61.0 \n", "Netherlands 0.45 74.0 \n", "New Zealand 13.87 73.0 \n", "Norway 2.82 75.0 \n", "OECD - Total 12.51 65.0 \n", "Poland 7.41 60.0 \n", "Portugal 9.62 61.0 \n", "Russia 0.16 69.0 \n", "Slovak Republic 7.02 60.0 \n", "Slovenia 5.63 63.0 \n", "Spain 5.89 56.0 \n", "Sweden 1.13 74.0 \n", "Switzerland 6.72 80.0 \n", "Turkey 40.86 50.0 \n", "United Kingdom 12.70 71.0 \n", "United States 11.30 67.0 \n", "\n", "Indicator Homicide rate Household net adjusted disposable income \\\n", "Country \n", "Australia 0.8 31588.0 \n", "Austria 0.4 31173.0 \n", "Belgium 1.1 28307.0 \n", "Brazil 25.5 11664.0 \n", "Canada 1.5 29365.0 \n", "Chile 4.4 14533.0 \n", "Czech Republic 0.8 18404.0 \n", "Denmark 0.3 26491.0 \n", "Estonia 4.8 15167.0 \n", "Finland 1.4 27927.0 \n", "France 0.6 28799.0 \n", "Germany 0.5 31252.0 \n", "Greece 1.6 18575.0 \n", "Hungary 1.3 15442.0 \n", "Iceland 0.3 23965.0 \n", "Ireland 0.8 23917.0 \n", "Israel 2.3 22104.0 \n", "Italy 0.7 25166.0 \n", "Japan 0.3 26111.0 \n", "Korea 1.1 19510.0 \n", "Luxembourg 0.4 38951.0 \n", "Mexico 23.4 13085.0 \n", "Netherlands 0.9 27888.0 \n", "New Zealand 1.2 23815.0 \n", "Norway 0.6 33492.0 \n", "OECD - Total 4.0 25908.0 \n", "Poland 0.9 17852.0 \n", "Portugal 1.1 20086.0 \n", "Russia 12.8 
19292.0 \n", "Slovak Republic 1.2 17503.0 \n", "Slovenia 0.4 19326.0 \n", "Spain 0.6 22477.0 \n", "Sweden 0.7 29185.0 \n", "Switzerland 0.5 33491.0 \n", "Turkey 1.2 14095.0 \n", "United Kingdom 0.3 27029.0 \n", "United States 5.2 41355.0 \n", "\n", "Indicator Household net financial wealth ... \\\n", "Country ... \n", "Australia 47657.0 ... \n", "Austria 49887.0 ... \n", "Belgium 83876.0 ... \n", "Brazil 6844.0 ... \n", "Canada 67913.0 ... \n", "Chile 17733.0 ... \n", "Czech Republic 17299.0 ... \n", "Denmark 44488.0 ... \n", "Estonia 7680.0 ... \n", "Finland 18761.0 ... \n", "France 48741.0 ... \n", "Germany 50394.0 ... \n", "Greece 14579.0 ... \n", "Hungary 13277.0 ... \n", "Iceland 43045.0 ... \n", "Ireland 31580.0 ... \n", "Israel 52933.0 ... \n", "Italy 54987.0 ... \n", "Japan 86764.0 ... \n", "Korea 29091.0 ... \n", "Luxembourg 61765.0 ... \n", "Mexico 9056.0 ... \n", "Netherlands 77961.0 ... \n", "New Zealand 28290.0 ... \n", "Norway 8797.0 ... \n", "OECD - Total 67139.0 ... \n", "Poland 10919.0 ... \n", "Portugal 31245.0 ... \n", "Russia 3412.0 ... \n", "Slovak Republic 8663.0 ... \n", "Slovenia 18465.0 ... \n", "Spain 24774.0 ... \n", "Sweden 60328.0 ... \n", "Switzerland 108823.0 ... \n", "Turkey 3251.0 ... \n", "United Kingdom 60778.0 ... \n", "United States 145769.0 ... 
\n", "\n", "Indicator Long-term unemployment rate Personal earnings \\\n", "Country \n", "Australia 1.08 50449.0 \n", "Austria 1.19 45199.0 \n", "Belgium 3.88 48082.0 \n", "Brazil 1.97 17177.0 \n", "Canada 0.90 46911.0 \n", "Chile 1.59 22101.0 \n", "Czech Republic 3.12 20338.0 \n", "Denmark 1.78 48347.0 \n", "Estonia 3.82 18944.0 \n", "Finland 1.73 40060.0 \n", "France 3.99 40242.0 \n", "Germany 2.37 43682.0 \n", "Greece 18.39 25503.0 \n", "Hungary 5.10 20948.0 \n", "Iceland 1.18 55716.0 \n", "Ireland 8.39 49506.0 \n", "Israel 0.79 28817.0 \n", "Italy 6.94 34561.0 \n", "Japan 1.67 35405.0 \n", "Korea 0.01 36354.0 \n", "Luxembourg 1.78 56021.0 \n", "Mexico 0.08 16193.0 \n", "Netherlands 2.40 47590.0 \n", "New Zealand 0.75 35609.0 \n", "Norway 0.32 50282.0 \n", "OECD - Total 2.79 36118.0 \n", "Poland 3.77 22655.0 \n", "Portugal 9.11 23688.0 \n", "Russia 1.70 20885.0 \n", "Slovak Republic 9.46 20307.0 \n", "Slovenia 5.15 32037.0 \n", "Spain 12.96 34824.0 \n", "Sweden 1.37 40818.0 \n", "Switzerland 1.46 54236.0 \n", "Turkey 2.37 16919.0 \n", "United Kingdom 2.77 41192.0 \n", "United States 1.91 56340.0 \n", "\n", "Indicator Quality of support network Rooms per person \\\n", "Country \n", "Australia 92.0 2.3 \n", "Austria 89.0 1.6 \n", "Belgium 94.0 2.2 \n", "Brazil 90.0 1.6 \n", "Canada 92.0 2.5 \n", "Chile 86.0 1.2 \n", "Czech Republic 85.0 1.4 \n", "Denmark 95.0 1.9 \n", "Estonia 89.0 1.5 \n", "Finland 95.0 1.9 \n", "France 87.0 1.8 \n", "Germany 94.0 1.8 \n", "Greece 83.0 1.2 \n", "Hungary 87.0 1.1 \n", "Iceland 96.0 1.5 \n", "Ireland 96.0 2.1 \n", "Israel 87.0 1.2 \n", "Italy 90.0 1.4 \n", "Japan 89.0 1.8 \n", "Korea 72.0 1.4 \n", "Luxembourg 87.0 2.0 \n", "Mexico 77.0 1.0 \n", "Netherlands 90.0 2.0 \n", "New Zealand 94.0 2.4 \n", "Norway 94.0 2.0 \n", "OECD - Total 88.0 1.8 \n", "Poland 91.0 1.1 \n", "Portugal 86.0 1.6 \n", "Russia 90.0 0.9 \n", "Slovak Republic 90.0 1.1 \n", "Slovenia 90.0 1.5 \n", "Spain 95.0 1.9 \n", "Sweden 92.0 1.7 \n", "Switzerland 96.0 1.8 
\n", "Turkey 86.0 1.1 \n", "United Kingdom 91.0 1.9 \n", "United States 90.0 2.4 \n", "\n", "Indicator Self-reported health Student skills \\\n", "Country \n", "Australia 85.0 512.0 \n", "Austria 69.0 500.0 \n", "Belgium 74.0 509.0 \n", "Brazil 69.0 402.0 \n", "Canada 89.0 522.0 \n", "Chile 59.0 436.0 \n", "Czech Republic 60.0 500.0 \n", "Denmark 72.0 498.0 \n", "Estonia 54.0 526.0 \n", "Finland 65.0 529.0 \n", "France 67.0 500.0 \n", "Germany 65.0 515.0 \n", "Greece 74.0 466.0 \n", "Hungary 57.0 487.0 \n", "Iceland 77.0 484.0 \n", "Ireland 82.0 516.0 \n", "Israel 80.0 474.0 \n", "Italy 66.0 490.0 \n", "Japan 30.0 540.0 \n", "Korea 35.0 542.0 \n", "Luxembourg 72.0 490.0 \n", "Mexico 66.0 417.0 \n", "Netherlands 76.0 519.0 \n", "New Zealand 90.0 509.0 \n", "Norway 76.0 496.0 \n", "OECD - Total 68.0 497.0 \n", "Poland 58.0 521.0 \n", "Portugal 46.0 488.0 \n", "Russia 37.0 481.0 \n", "Slovak Republic 66.0 472.0 \n", "Slovenia 65.0 499.0 \n", "Spain 72.0 490.0 \n", "Sweden 81.0 482.0 \n", "Switzerland 81.0 518.0 \n", "Turkey 68.0 462.0 \n", "United Kingdom 74.0 502.0 \n", "United States 88.0 492.0 \n", "\n", "Indicator Time devoted to leisure and personal care Voter turnout \\\n", "Country \n", "Australia 14.41 93.0 \n", "Austria 14.46 75.0 \n", "Belgium 15.71 89.0 \n", "Brazil 14.97 79.0 \n", "Canada 14.25 61.0 \n", "Chile 14.41 49.0 \n", "Czech Republic 14.98 59.0 \n", "Denmark 16.06 88.0 \n", "Estonia 14.90 64.0 \n", "Finland 14.89 69.0 \n", "France 15.33 80.0 \n", "Germany 15.31 72.0 \n", "Greece 14.91 64.0 \n", "Hungary 15.04 62.0 \n", "Iceland 14.61 81.0 \n", "Ireland 15.19 70.0 \n", "Israel 14.48 68.0 \n", "Italy 14.98 75.0 \n", "Japan 14.93 53.0 \n", "Korea 14.63 76.0 \n", "Luxembourg 15.12 91.0 \n", "Mexico 13.89 63.0 \n", "Netherlands 15.44 75.0 \n", "New Zealand 14.87 77.0 \n", "Norway 15.56 78.0 \n", "OECD - Total 14.97 68.0 \n", "Poland 14.20 55.0 \n", "Portugal 14.95 58.0 \n", "Russia 14.97 65.0 \n", "Slovak Republic 14.99 59.0 \n", "Slovenia 14.62 52.0 
\n", "Spain 16.06 69.0 \n", "Sweden 15.11 86.0 \n", "Switzerland 14.98 49.0 \n", "Turkey 13.42 88.0 \n", "United Kingdom 14.83 66.0 \n", "United States 14.27 68.0 \n", "\n", "Indicator Water quality Years in education \n", "Country \n", "Australia 91.0 19.4 \n", "Austria 94.0 17.0 \n", "Belgium 87.0 18.9 \n", "Brazil 72.0 16.3 \n", "Canada 91.0 17.2 \n", "Chile 73.0 16.5 \n", "Czech Republic 85.0 18.1 \n", "Denmark 94.0 19.4 \n", "Estonia 79.0 17.5 \n", "Finland 94.0 19.7 \n", "France 82.0 16.4 \n", "Germany 95.0 18.2 \n", "Greece 69.0 18.6 \n", "Hungary 77.0 17.6 \n", "Iceland 97.0 19.8 \n", "Ireland 80.0 17.6 \n", "Israel 68.0 15.8 \n", "Italy 71.0 16.8 \n", "Japan 85.0 16.3 \n", "Korea 78.0 17.5 \n", "Luxembourg 86.0 15.1 \n", "Mexico 67.0 14.4 \n", "Netherlands 92.0 18.7 \n", "New Zealand 89.0 18.1 \n", "Norway 94.0 17.9 \n", "OECD - Total 81.0 17.7 \n", "Poland 79.0 18.4 \n", "Portugal 86.0 17.6 \n", "Russia 56.0 16.0 \n", "Slovak Republic 81.0 16.3 \n", "Slovenia 88.0 18.4 \n", "Spain 71.0 17.6 \n", "Sweden 95.0 19.3 \n", "Switzerland 96.0 17.3 \n", "Turkey 62.0 16.4 \n", "United Kingdom 88.0 16.4 \n", "United States 85.0 17.2 \n", "\n", "[37 rows x 24 columns]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "life_sat_pivoted = life_sat_total.pivot(index=\"Country\", columns=\"Indicator\", values=\"Value\")\n", "life_sat_pivoted" ] }, { "cell_type": "markdown", "id": "6bb04a1c", "metadata": {}, "source": [ "## GDP per capita\n", "This dataset was obtained from the IMF's website at: http://goo.gl/j1MSKe\n", "\n", "### Data description\n", "```\n", "Int64Index: 190 entries, 0 to 189\n", "Data columns (total 7 columns):\n", "Country 190 non-null object\n", "Subject Descriptor 189 non-null object\n", "Units 189 non-null object\n", "Scale 189 non-null object\n", "Country/Series-specific Notes 188 non-null object\n", "2015 187 non-null float64\n", "Estimates Start After 188 non-null float64\n", "dtypes: float64(2), 
object(5)\n", "memory usage: 11.9+ KB\n", "```\n", "\n", "### Example using Python Pandas\n", "\n", "```\n", ">>> gdp_per_capita = pd.read_csv(\n", "... datapath+\"gdp_per_capita.csv\", thousands=',', delimiter='\\t',\n", "... encoding='latin1', na_values=\"n/a\", index_col=\"Country\")\n", "...\n", ">>> gdp_per_capita.rename(columns={\"2015\": \"GDP per capita\"}, inplace=True)\n", "```" ] }, { "cell_type": "code", "execution_count": 17, "id": "fc31575a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Country \\\n", "0 Afghanistan \n", "1 Albania \n", "2 Algeria \n", "3 Angola \n", "4 Antigua and Barbuda \n", ".. ... \n", "185 Vietnam \n", "186 Yemen \n", "187 Zambia \n", "188 Zimbabwe \n", "189 International Monetary Fund, World Economic Ou... \n", "\n", " Subject Descriptor Units Scale \\\n", "0 Gross domestic product per capita, current prices U.S. dollars Units \n", "1 Gross domestic product per capita, current prices U.S. dollars Units \n", "2 Gross domestic product per capita, current prices U.S. dollars Units \n", "3 Gross domestic product per capita, current prices U.S. dollars Units \n", "4 Gross domestic product per capita, current prices U.S. dollars Units \n", ".. ... ... ... \n", "185 Gross domestic product per capita, current prices U.S. dollars Units \n", "186 Gross domestic product per capita, current prices U.S. dollars Units \n", "187 Gross domestic product per capita, current prices U.S. dollars Units \n", "188 Gross domestic product per capita, current prices U.S. dollars Units \n", "189 NaN NaN NaN \n", "\n", " Country/Series-specific Notes 2015 \\\n", "0 See notes for: Gross domestic product, curren... 599.994 \n", "1 See notes for: Gross domestic product, curren... 3995.383 \n", "2 See notes for: Gross domestic product, curren... 4318.135 \n", "3 See notes for: Gross domestic product, curren... 4100.315 \n", "4 See notes for: Gross domestic product, curren... 14414.302 \n", ".. ... ... \n", "185 See notes for: Gross domestic product, curren... 2088.344 \n", "186 See notes for: Gross domestic product, curren... 1302.940 \n", "187 See notes for: Gross domestic product, curren... 1350.151 \n", "188 See notes for: Gross domestic product, curren... 1064.350 \n", "189 NaN NaN \n", "\n", " Estimates Start After \n", "0 2013.0 \n", "1 2010.0 \n", "2 2014.0 \n", "3 2014.0 \n", "4 2011.0 \n", ".. ... 
\n", "185 2012.0 \n", "186 2008.0 \n", "187 2010.0 \n", "188 2012.0 \n", "189 NaN \n", "\n", "[190 rows x 7 columns]" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "url2 = \"https://raw.githubusercontent.com/machine-learning-course-uac/1-ml-landscape/main/gdp_per_capita.csv\"\n", "gdp_per_capita = pd.read_csv(url2,thousands=',',delimiter='\\t', encoding='latin1', na_values=\"n/a\")\n", "gdp_per_capita" ] }, { "cell_type": "code", "execution_count": 18, "id": "b355820d", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Country \\\n", "0 Afghanistan \n", "1 Albania \n", "2 Algeria \n", "3 Angola \n", "4 Antigua and Barbuda \n", ".. ... \n", "185 Vietnam \n", "186 Yemen \n", "187 Zambia \n", "188 Zimbabwe \n", "189 International Monetary Fund, World Economic Ou... \n", "\n", " Subject Descriptor Units Scale \\\n", "0 Gross domestic product per capita, current prices U.S. dollars Units \n", "1 Gross domestic product per capita, current prices U.S. dollars Units \n", "2 Gross domestic product per capita, current prices U.S. dollars Units \n", "3 Gross domestic product per capita, current prices U.S. dollars Units \n", "4 Gross domestic product per capita, current prices U.S. dollars Units \n", ".. ... ... ... \n", "185 Gross domestic product per capita, current prices U.S. dollars Units \n", "186 Gross domestic product per capita, current prices U.S. dollars Units \n", "187 Gross domestic product per capita, current prices U.S. dollars Units \n", "188 Gross domestic product per capita, current prices U.S. dollars Units \n", "189 NaN NaN NaN \n", "\n", " Country/Series-specific Notes GDP \\\n", "0 See notes for: Gross domestic product, curren... 599.994 \n", "1 See notes for: Gross domestic product, curren... 3995.383 \n", "2 See notes for: Gross domestic product, curren... 4318.135 \n", "3 See notes for: Gross domestic product, curren... 4100.315 \n", "4 See notes for: Gross domestic product, curren... 14414.302 \n", ".. ... ... \n", "185 See notes for: Gross domestic product, curren... 2088.344 \n", "186 See notes for: Gross domestic product, curren... 1302.940 \n", "187 See notes for: Gross domestic product, curren... 1350.151 \n", "188 See notes for: Gross domestic product, curren... 1064.350 \n", "189 NaN NaN \n", "\n", " Estimates Start After \n", "0 2013.0 \n", "1 2010.0 \n", "2 2014.0 \n", "3 2014.0 \n", "4 2011.0 \n", ".. ... 
\n", "185 2012.0 \n", "186 2008.0 \n", "187 2010.0 \n", "188 2012.0 \n", "189 NaN \n", "\n", "[190 rows x 7 columns]" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gdp_per_capita.rename(columns={\"2015\": \"GDP\"}, inplace=True)\n", "gdp_per_capita" ] }, { "cell_type": "code", "execution_count": 19, "id": "44bce56b", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Subject Descriptor \\\n", "Country \n", "Afghanistan Gross domestic product per capita, current prices \n", "Albania Gross domestic product per capita, current prices \n", "Algeria Gross domestic product per capita, current prices \n", "Angola Gross domestic product per capita, current prices \n", "Antigua and Barbuda Gross domestic product per capita, current prices \n", "... ... \n", "Vietnam Gross domestic product per capita, current prices \n", "Yemen Gross domestic product per capita, current prices \n", "Zambia Gross domestic product per capita, current prices \n", "Zimbabwe Gross domestic product per capita, current prices \n", "International Monetary Fund, World Economic Out... NaN \n", "\n", " Units Scale \\\n", "Country \n", "Afghanistan U.S. dollars Units \n", "Albania U.S. dollars Units \n", "Algeria U.S. dollars Units \n", "Angola U.S. dollars Units \n", "Antigua and Barbuda U.S. dollars Units \n", "... ... ... \n", "Vietnam U.S. dollars Units \n", "Yemen U.S. dollars Units \n", "Zambia U.S. dollars Units \n", "Zimbabwe U.S. dollars Units \n", "International Monetary Fund, World Economic Out... NaN NaN \n", "\n", " Country/Series-specific Notes \\\n", "Country \n", "Afghanistan See notes for: Gross domestic product, curren... \n", "Albania See notes for: Gross domestic product, curren... \n", "Algeria See notes for: Gross domestic product, curren... \n", "Angola See notes for: Gross domestic product, curren... \n", "Antigua and Barbuda See notes for: Gross domestic product, curren... \n", "... ... \n", "Vietnam See notes for: Gross domestic product, curren... \n", "Yemen See notes for: Gross domestic product, curren... \n", "Zambia See notes for: Gross domestic product, curren... \n", "Zimbabwe See notes for: Gross domestic product, curren... \n", "International Monetary Fund, World Economic Out... 
NaN \n", "\n", " GDP \\\n", "Country \n", "Afghanistan 599.994 \n", "Albania 3995.383 \n", "Algeria 4318.135 \n", "Angola 4100.315 \n", "Antigua and Barbuda 14414.302 \n", "... ... \n", "Vietnam 2088.344 \n", "Yemen 1302.940 \n", "Zambia 1350.151 \n", "Zimbabwe 1064.350 \n", "International Monetary Fund, World Economic Out... NaN \n", "\n", " Estimates Start After \n", "Country \n", "Afghanistan 2013.0 \n", "Albania 2010.0 \n", "Algeria 2014.0 \n", "Angola 2014.0 \n", "Antigua and Barbuda 2011.0 \n", "... ... \n", "Vietnam 2012.0 \n", "Yemen 2008.0 \n", "Zambia 2010.0 \n", "Zimbabwe 2012.0 \n", "International Monetary Fund, World Economic Out... NaN \n", "\n", "[190 rows x 6 columns]" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gdp_per_capita.set_index(\"Country\", inplace=True)\n", "gdp_per_capita" ] }, { "cell_type": "markdown", "id": "d334c46b", "metadata": {}, "source": [ "# Wrapping the whole process in a function" ] }, { "cell_type": "code", "execution_count": 20, "id": "a0cb5245", "metadata": {}, "outputs": [], "source": [ "def CountryStats(oecd, gdp):\n", "    # YOUR CODE HERE\n", "    return country_stats[[\"GDP\", \"Life satisfaction\"]].iloc[keep_indices]" ] }, { "cell_type": "code", "execution_count": 10, "id": "37cd9901", "metadata": {}, "outputs": [], "source": [ "import mluac as ml\n", "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 23, "id": "df9ca7f1", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " GDP per capita Life satisfaction\n", "Country \n", "Russia 9054.914 6.0\n", "Turkey 9437.372 5.6\n", "Hungary 12239.894 4.9\n", "Poland 12495.334 5.8\n", "Slovak Republic 15991.736 6.1\n", "Estonia 17288.083 5.6\n", "Greece 18064.288 4.8\n", "Portugal 19121.592 5.1\n", "Slovenia 20732.482 5.7\n", "Spain 25864.721 6.5\n", "Korea 27195.197 5.8\n", "Italy 29866.581 6.0\n", "Japan 32485.545 5.9\n", "Israel 35343.336 7.4\n", "New Zealand 37044.891 7.3\n", "France 37675.006 6.5\n", "Belgium 40106.632 6.9\n", "Germany 40996.511 7.0\n", "Finland 41973.988 7.4\n", "Canada 43331.961 7.3\n", "Netherlands 43603.115 7.3\n", "Austria 43724.031 6.9\n", "United Kingdom 43770.688 6.8\n", "Sweden 49866.266 7.2\n", "Iceland 50854.583 7.5\n", "Australia 50961.865 7.3\n", "Ireland 51350.744 7.0\n", "Denmark 52114.165 7.5\n", "United States 55805.204 7.2" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "url2 = \"https://raw.githubusercontent.com/machine-learning-course-uac/1-ml-landscape/main/gdp_per_capita.csv\"\n", "gdp_per_capita = pd.read_csv(url2,thousands=',',delimiter='\\t', encoding='latin1', na_values=\"n/a\")\n", "url = \"https://raw.githubusercontent.com/machine-learning-course-uac/1-ml-landscape/main/oecd_bli_2015.csv\"\n", "oecd_bli = pd.read_csv(url, thousands=',')\n", "cs = ml.prepare_country_stats(oecd_bli, gdp_per_capita)\n", "cs" ] }, { "cell_type": "markdown", "id": "3466ca29", "metadata": {}, "source": [ "## Exploring the data" ] }, { "cell_type": "code", "execution_count": 18, "id": "5bfd30f2", "metadata": {}, "outputs": [], "source": [ "X = np.c_[cs[\"GDP per capita\"]]\n", "y = np.c_[cs[\"Life satisfaction\"]]" ] }, { "cell_type": "code", "execution_count": 19, "id": "51ac6ffc", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 9054.914],\n", " [ 9437.372],\n", " [12239.894],\n", " [12495.334],\n", " [15991.736],\n", " [17288.083],\n", " [18064.288],\n", " [19121.592],\n", " 
[20732.482],\n", " [25864.721],\n", " [27195.197],\n", " [29866.581],\n", " [32485.545],\n", " [35343.336],\n", " [37044.891],\n", " [37675.006],\n", " [40106.632],\n", " [40996.511],\n", " [41973.988],\n", " [43331.961],\n", " [43603.115],\n", " [43724.031],\n", " [43770.688],\n", " [49866.266],\n", " [50854.583],\n", " [50961.865],\n", " [51350.744],\n", " [52114.165],\n", " [55805.204]])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X" ] }, { "cell_type": "code", "execution_count": 20, "id": "75460ea4", "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.plot(X,y, 'ok', markersize=12, label=\"Data\")\n", "plt.xlabel(\"GDP per capita\", fontsize=20)\n", "plt.ylabel(\"Life satisfaction\", fontsize=20)\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "d4cdb351", "metadata": {}, "source": [ "# Model of GDP and Life Satisfaction\n", "## Fitting and predictions" ] }, { "cell_type": "code", "execution_count": 21, "id": "86ee41bd", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "LinearRegression()" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.linear_model import LinearRegression\n", "# Select a linear model\n", "model = LinearRegression()\n", "\n", "# Train the model\n", "model.fit(X, y)" ] }, { "cell_type": "code", "execution_count": 22, "id": "25b99bc8", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[5.96242338]]\n" ] } ], "source": [ "# Make a prediction for Cyprus\n", "X_new = [[22587]] # Cyprus' GDP per capita\n", "print(model.predict(X_new)) # outputs [[ 5.96242338]]" ] }, { "cell_type": "markdown", "id": "fe2c6057", "metadata": {}, "source": [ "## Extracting parameters from the model\n", "\n", "The linear regression model is\n", "\n", "$$f(x)=\\theta_0 + \\theta_1 x_1 + \\theta_2 x_2 + \\cdots$$\n", "\n", "With a single feature (GDP per capita), $\\theta_0$ is the intercept and $\\theta_1$ is the slope." ] }, { "cell_type": "code", "execution_count": 25, "id": "5278c68f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4.853052800266435" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# theta_0 (intercept)\n", "t0 = model.intercept_[0]\n", "t0" ] }, { "cell_type": "code", "execution_count": 27, "id": "7964db8c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[4.91154459e-05]])" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# theta_1 (slope)\n", "t1 = model.coef_\n", "t1" ] }, { "cell_type": "code", "execution_count": 29, "id": "2c494d36", "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "Xest=np.linspace(0, 60000, 1000)\n", "plt.plot(X, t0 + t1*X, \"r\", linewidth=3,label=\"Model\")\n", "plt.plot(X,y, 'ok', markersize=12, label=\"Data\")\n", "plt.xlabel(\"GDP per capita\", fontsize=20)\n", "plt.ylabel(\"Life satisfaction\", fontsize=20)\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "id": "714824a0", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 5 }