{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "# Libreria Pandas\n",
    "\n",
    "**Pandas** es la librería por excelencia para el análisis de datos del lenguaje `Python`. Su nombre proviene de “panel data” (término econométrico). Inspirada en las funcionalidades de `R`, pero con el potencial de este lenguaje de propósito general.\n",
    "\n",
    "**Pandas** incluye todas las funcionalidades necesarias para el proceso de análisis de datos: carga, filtrado, tratamiento, síntesis, agrupamiento, almacenamiento y visualización. Además, se integra con el resto de librerías de cálculo numérico como `Numpy`, `Matplotlib`, `scikit-learn`, …  y de despliegue: `HPC`, `Cloud`, etc.\n",
    "\n",
    "En resumen, **es como una hoja de cálculo -por ejemplo excel- pero con más mucho más potencial!!!**\n",
    "\n",
    "[Características principales](https://github.com/pandas-dev/pandas#main-features)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Todo el trabajo que realizaremos es sobre la estructura de datos básica: el `dataFrame`.\n",
    "\n",
    "Un `dataFrame` es un objeto de dos dimensiones que contiene información. También puede verse como una **hoja de cálculo**, como una tabla de un modelo entidad-relación, o como una colección de una base de datos no relacional.\n",
    "\n",
    "[Documentación](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Requirement already satisfied: pandas in /Users/isaac/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages (1.5.3)\n",
      "Requirement already satisfied: python-dateutil>=2.8.1 in /Users/isaac/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages (from pandas) (2.8.2)\n",
      "Requirement already satisfied: pytz>=2020.1 in /Users/isaac/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages (from pandas) (2023.3)\n",
      "Requirement already satisfied: numpy>=1.21.0 in /Users/isaac/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages (from pandas) (1.23.5)\n",
      "Requirement already satisfied: six>=1.5 in /Users/isaac/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages (from python-dateutil>=2.8.1->pandas) (1.16.0)\n",
      "\n",
      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.3.1\u001b[0m\n",
      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n",
      "Note: you may need to restart the kernel to use updated packages.\n"
     ]
    }
   ],
   "source": [
    "!uv pip install pandas"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Importación de la libreria\n",
    "```Python\n",
    "import pandas as pd\n",
    "import pandas\n",
    "from pandas import *\n",
    "```\n",
    "\n",
    "Por convención se hace de la siguiente manera, todas las funciones de la libreria se tienen que llamar con el prefijo pd.*:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## Carga de datos\n",
    "Vamos a aprender Pandas a través de una serie de proyectos y ejemplos. En esta primera fase, vamos a cargar datos de un fichero _CSV_, recordad que son ficheros donde los atributos/valores de una observación están separados por una coma y las observaciones se separan mediante un salto de línea.\n",
    "\n",
    "Podemos descargar los datos con los que trabajaremos del siguiente enlace [WHO dataset](http://www.exploredata.net/Downloads/WHO-Data-Set)\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Empezamos viendo como se carga un dataframe a partir de un fichero en formato CSV."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "df = pd.read_csv(\"data/WHO.csv\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>151.0</td>\n",
       "      <td>28.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>26088.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>...</td>\n",
       "      <td>692.50</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>257.00</td>\n",
       "      <td>231.9</td>\n",
       "      <td>257.00</td>\n",
       "      <td>5740436.0</td>\n",
       "      <td>5.44</td>\n",
       "      <td>22.9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Albania</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>27.0</td>\n",
       "      <td>98.7</td>\n",
       "      <td>6000.0</td>\n",
       "      <td>93.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>3172.0</td>\n",
       "      <td>0.6</td>\n",
       "      <td>...</td>\n",
       "      <td>3499.12</td>\n",
       "      <td>4.790000e+09</td>\n",
       "      <td>78.14</td>\n",
       "      <td>-2.040000e+09</td>\n",
       "      <td>18.47</td>\n",
       "      <td>15.5</td>\n",
       "      <td>18.47</td>\n",
       "      <td>1431793.9</td>\n",
       "      <td>2.21</td>\n",
       "      <td>45.4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Algeria</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>6.0</td>\n",
       "      <td>69.9</td>\n",
       "      <td>5940.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>96.0</td>\n",
       "      <td>33351.0</td>\n",
       "      <td>1.5</td>\n",
       "      <td>...</td>\n",
       "      <td>137535.56</td>\n",
       "      <td>6.970000e+10</td>\n",
       "      <td>351.36</td>\n",
       "      <td>4.700000e+09</td>\n",
       "      <td>40.00</td>\n",
       "      <td>31.2</td>\n",
       "      <td>40.00</td>\n",
       "      <td>20800000.0</td>\n",
       "      <td>2.61</td>\n",
       "      <td>63.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Andorra</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>83.0</td>\n",
       "      <td>83.0</td>\n",
       "      <td>74.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Angola</td>\n",
       "      <td>5</td>\n",
       "      <td>3</td>\n",
       "      <td>146.0</td>\n",
       "      <td>67.4</td>\n",
       "      <td>3890.0</td>\n",
       "      <td>49.0</td>\n",
       "      <td>51.0</td>\n",
       "      <td>16557.0</td>\n",
       "      <td>2.8</td>\n",
       "      <td>...</td>\n",
       "      <td>8991.46</td>\n",
       "      <td>1.490000e+10</td>\n",
       "      <td>27.13</td>\n",
       "      <td>9.140000e+09</td>\n",
       "      <td>164.10</td>\n",
       "      <td>242.5</td>\n",
       "      <td>164.10</td>\n",
       "      <td>8578749.0</td>\n",
       "      <td>4.14</td>\n",
       "      <td>53.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>197</th>\n",
       "      <td>Vietnam</td>\n",
       "      <td>198</td>\n",
       "      <td>6</td>\n",
       "      <td>25.0</td>\n",
       "      <td>90.3</td>\n",
       "      <td>2310.0</td>\n",
       "      <td>91.0</td>\n",
       "      <td>96.0</td>\n",
       "      <td>86206.0</td>\n",
       "      <td>1.4</td>\n",
       "      <td>...</td>\n",
       "      <td>101826.23</td>\n",
       "      <td>4.480000e+10</td>\n",
       "      <td>47.11</td>\n",
       "      <td>-1.940000e+09</td>\n",
       "      <td>20.20</td>\n",
       "      <td>23.4</td>\n",
       "      <td>20.20</td>\n",
       "      <td>21900000.0</td>\n",
       "      <td>2.90</td>\n",
       "      <td>26.4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>198</th>\n",
       "      <td>West Bank and Gaza</td>\n",
       "      <td>199</td>\n",
       "      <td>1</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>655.86</td>\n",
       "      <td>3.780000e+09</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>28.00</td>\n",
       "      <td>25.8</td>\n",
       "      <td>28.00</td>\n",
       "      <td>2596216.0</td>\n",
       "      <td>3.33</td>\n",
       "      <td>71.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>199</th>\n",
       "      <td>Yemen</td>\n",
       "      <td>200</td>\n",
       "      <td>1</td>\n",
       "      <td>83.0</td>\n",
       "      <td>54.1</td>\n",
       "      <td>2090.0</td>\n",
       "      <td>65.0</td>\n",
       "      <td>85.0</td>\n",
       "      <td>21732.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>...</td>\n",
       "      <td>20148.34</td>\n",
       "      <td>1.150000e+10</td>\n",
       "      <td>114.52</td>\n",
       "      <td>8.310000e+08</td>\n",
       "      <td>82.40</td>\n",
       "      <td>87.9</td>\n",
       "      <td>82.40</td>\n",
       "      <td>5759120.5</td>\n",
       "      <td>4.37</td>\n",
       "      <td>27.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>200</th>\n",
       "      <td>Zambia</td>\n",
       "      <td>201</td>\n",
       "      <td>3</td>\n",
       "      <td>161.0</td>\n",
       "      <td>68.0</td>\n",
       "      <td>1140.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>90.0</td>\n",
       "      <td>11696.0</td>\n",
       "      <td>1.9</td>\n",
       "      <td>...</td>\n",
       "      <td>2366.94</td>\n",
       "      <td>4.090000e+09</td>\n",
       "      <td>10.41</td>\n",
       "      <td>-4.470000e+08</td>\n",
       "      <td>175.30</td>\n",
       "      <td>163.8</td>\n",
       "      <td>175.30</td>\n",
       "      <td>4017411.0</td>\n",
       "      <td>1.95</td>\n",
       "      <td>35.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>201</th>\n",
       "      <td>Zimbabwe</td>\n",
       "      <td>202</td>\n",
       "      <td>3</td>\n",
       "      <td>101.0</td>\n",
       "      <td>89.5</td>\n",
       "      <td>NaN</td>\n",
       "      <td>88.0</td>\n",
       "      <td>87.0</td>\n",
       "      <td>13228.0</td>\n",
       "      <td>0.8</td>\n",
       "      <td>...</td>\n",
       "      <td>11457.33</td>\n",
       "      <td>5.620000e+09</td>\n",
       "      <td>3.39</td>\n",
       "      <td>-1.710000e+08</td>\n",
       "      <td>106.50</td>\n",
       "      <td>67.0</td>\n",
       "      <td>106.50</td>\n",
       "      <td>4709965.0</td>\n",
       "      <td>1.90</td>\n",
       "      <td>35.9</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>202 rows × 358 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                Country  CountryID  Continent  Adolescent fertility rate (%)  \\\n",
       "0           Afghanistan          1          1                          151.0   \n",
       "1               Albania          2          2                           27.0   \n",
       "2               Algeria          3          3                            6.0   \n",
       "3               Andorra          4          2                            NaN   \n",
       "4                Angola          5          3                          146.0   \n",
       "..                  ...        ...        ...                            ...   \n",
       "197             Vietnam        198          6                           25.0   \n",
       "198  West Bank and Gaza        199          1                            NaN   \n",
       "199               Yemen        200          1                           83.0   \n",
       "200              Zambia        201          3                          161.0   \n",
       "201            Zimbabwe        202          3                          101.0   \n",
       "\n",
       "     Adult literacy rate (%)  \\\n",
       "0                       28.0   \n",
       "1                       98.7   \n",
       "2                       69.9   \n",
       "3                        NaN   \n",
       "4                       67.4   \n",
       "..                       ...   \n",
       "197                     90.3   \n",
       "198                      NaN   \n",
       "199                     54.1   \n",
       "200                     68.0   \n",
       "201                     89.5   \n",
       "\n",
       "     Gross national income per capita (PPP international $)  \\\n",
       "0                                                  NaN        \n",
       "1                                               6000.0        \n",
       "2                                               5940.0        \n",
       "3                                                  NaN        \n",
       "4                                               3890.0        \n",
       "..                                                 ...        \n",
       "197                                             2310.0        \n",
       "198                                                NaN        \n",
       "199                                             2090.0        \n",
       "200                                             1140.0        \n",
       "201                                                NaN        \n",
       "\n",
       "     Net primary school enrolment ratio female (%)  \\\n",
       "0                                              NaN   \n",
       "1                                             93.0   \n",
       "2                                             94.0   \n",
       "3                                             83.0   \n",
       "4                                             49.0   \n",
       "..                                             ...   \n",
       "197                                           91.0   \n",
       "198                                            NaN   \n",
       "199                                           65.0   \n",
       "200                                           94.0   \n",
       "201                                           88.0   \n",
       "\n",
       "     Net primary school enrolment ratio male (%)  \\\n",
       "0                                            NaN   \n",
       "1                                           94.0   \n",
       "2                                           96.0   \n",
       "3                                           83.0   \n",
       "4                                           51.0   \n",
       "..                                           ...   \n",
       "197                                         96.0   \n",
       "198                                          NaN   \n",
       "199                                         85.0   \n",
       "200                                         90.0   \n",
       "201                                         87.0   \n",
       "\n",
       "     Population (in thousands) total  Population annual growth rate (%)  ...  \\\n",
       "0                            26088.0                                4.0  ...   \n",
       "1                             3172.0                                0.6  ...   \n",
       "2                            33351.0                                1.5  ...   \n",
       "3                               74.0                                1.0  ...   \n",
       "4                            16557.0                                2.8  ...   \n",
       "..                               ...                                ...  ...   \n",
       "197                          86206.0                                1.4  ...   \n",
       "198                              NaN                                NaN  ...   \n",
       "199                          21732.0                                3.0  ...   \n",
       "200                          11696.0                                1.9  ...   \n",
       "201                          13228.0                                0.8  ...   \n",
       "\n",
       "     Total_CO2_emissions  Total_income  Total_reserves  \\\n",
       "0                 692.50           NaN             NaN   \n",
       "1                3499.12  4.790000e+09           78.14   \n",
       "2              137535.56  6.970000e+10          351.36   \n",
       "3                    NaN           NaN             NaN   \n",
       "4                8991.46  1.490000e+10           27.13   \n",
       "..                   ...           ...             ...   \n",
       "197            101826.23  4.480000e+10           47.11   \n",
       "198               655.86  3.780000e+09             NaN   \n",
       "199             20148.34  1.150000e+10          114.52   \n",
       "200              2366.94  4.090000e+09           10.41   \n",
       "201             11457.33  5.620000e+09            3.39   \n",
       "\n",
       "     Trade_balance_goods_and_services  Under_five_mortality_from_CME  \\\n",
       "0                                 NaN                         257.00   \n",
       "1                       -2.040000e+09                          18.47   \n",
       "2                        4.700000e+09                          40.00   \n",
       "3                                 NaN                            NaN   \n",
       "4                        9.140000e+09                         164.10   \n",
       "..                                ...                            ...   \n",
       "197                     -1.940000e+09                          20.20   \n",
       "198                               NaN                          28.00   \n",
       "199                      8.310000e+08                          82.40   \n",
       "200                     -4.470000e+08                         175.30   \n",
       "201                     -1.710000e+08                         106.50   \n",
       "\n",
       "     Under_five_mortality_from_IHME  Under_five_mortality_rate  \\\n",
       "0                             231.9                     257.00   \n",
       "1                              15.5                      18.47   \n",
       "2                              31.2                      40.00   \n",
       "3                               NaN                        NaN   \n",
       "4                             242.5                     164.10   \n",
       "..                              ...                        ...   \n",
       "197                            23.4                      20.20   \n",
       "198                            25.8                      28.00   \n",
       "199                            87.9                      82.40   \n",
       "200                           163.8                     175.30   \n",
       "201                            67.0                     106.50   \n",
       "\n",
       "     Urban_population  Urban_population_growth  Urban_population_pct_of_total  \n",
       "0           5740436.0                     5.44                           22.9  \n",
       "1           1431793.9                     2.21                           45.4  \n",
       "2          20800000.0                     2.61                           63.3  \n",
       "3                 NaN                      NaN                            NaN  \n",
       "4           8578749.0                     4.14                           53.3  \n",
       "..                ...                      ...                            ...  \n",
       "197        21900000.0                     2.90                           26.4  \n",
       "198         2596216.0                     3.33                           71.6  \n",
       "199         5759120.5                     4.37                           27.3  \n",
       "200         4017411.0                     1.95                           35.0  \n",
       "201         4709965.0                     1.90                           35.9  \n",
       "\n",
       "[202 rows x 358 columns]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "A continuación se muestra la estructura interna del DataFrame. Se puede ver que és muy parecido a una tabla bidimensional:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>151.0</td>\n",
       "      <td>28.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>26088.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>...</td>\n",
       "      <td>692.50</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>257.00</td>\n",
       "      <td>231.9</td>\n",
       "      <td>257.00</td>\n",
       "      <td>5740436.0</td>\n",
       "      <td>5.44</td>\n",
       "      <td>22.9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Albania</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>27.0</td>\n",
       "      <td>98.7</td>\n",
       "      <td>6000.0</td>\n",
       "      <td>93.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>3172.0</td>\n",
       "      <td>0.6</td>\n",
       "      <td>...</td>\n",
       "      <td>3499.12</td>\n",
       "      <td>4.790000e+09</td>\n",
       "      <td>78.14</td>\n",
       "      <td>-2.040000e+09</td>\n",
       "      <td>18.47</td>\n",
       "      <td>15.5</td>\n",
       "      <td>18.47</td>\n",
       "      <td>1431793.9</td>\n",
       "      <td>2.21</td>\n",
       "      <td>45.4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Algeria</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>6.0</td>\n",
       "      <td>69.9</td>\n",
       "      <td>5940.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>96.0</td>\n",
       "      <td>33351.0</td>\n",
       "      <td>1.5</td>\n",
       "      <td>...</td>\n",
       "      <td>137535.56</td>\n",
       "      <td>6.970000e+10</td>\n",
       "      <td>351.36</td>\n",
       "      <td>4.700000e+09</td>\n",
       "      <td>40.00</td>\n",
       "      <td>31.2</td>\n",
       "      <td>40.00</td>\n",
       "      <td>20800000.0</td>\n",
       "      <td>2.61</td>\n",
       "      <td>63.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Andorra</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>83.0</td>\n",
       "      <td>83.0</td>\n",
       "      <td>74.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Angola</td>\n",
       "      <td>5</td>\n",
       "      <td>3</td>\n",
       "      <td>146.0</td>\n",
       "      <td>67.4</td>\n",
       "      <td>3890.0</td>\n",
       "      <td>49.0</td>\n",
       "      <td>51.0</td>\n",
       "      <td>16557.0</td>\n",
       "      <td>2.8</td>\n",
       "      <td>...</td>\n",
       "      <td>8991.46</td>\n",
       "      <td>1.490000e+10</td>\n",
       "      <td>27.13</td>\n",
       "      <td>9.140000e+09</td>\n",
       "      <td>164.10</td>\n",
       "      <td>242.5</td>\n",
       "      <td>164.10</td>\n",
       "      <td>8578749.0</td>\n",
       "      <td>4.14</td>\n",
       "      <td>53.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>197</th>\n",
       "      <td>Vietnam</td>\n",
       "      <td>198</td>\n",
       "      <td>6</td>\n",
       "      <td>25.0</td>\n",
       "      <td>90.3</td>\n",
       "      <td>2310.0</td>\n",
       "      <td>91.0</td>\n",
       "      <td>96.0</td>\n",
       "      <td>86206.0</td>\n",
       "      <td>1.4</td>\n",
       "      <td>...</td>\n",
       "      <td>101826.23</td>\n",
       "      <td>4.480000e+10</td>\n",
       "      <td>47.11</td>\n",
       "      <td>-1.940000e+09</td>\n",
       "      <td>20.20</td>\n",
       "      <td>23.4</td>\n",
       "      <td>20.20</td>\n",
       "      <td>21900000.0</td>\n",
       "      <td>2.90</td>\n",
       "      <td>26.4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>198</th>\n",
       "      <td>West Bank and Gaza</td>\n",
       "      <td>199</td>\n",
       "      <td>1</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>655.86</td>\n",
       "      <td>3.780000e+09</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>28.00</td>\n",
       "      <td>25.8</td>\n",
       "      <td>28.00</td>\n",
       "      <td>2596216.0</td>\n",
       "      <td>3.33</td>\n",
       "      <td>71.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>199</th>\n",
       "      <td>Yemen</td>\n",
       "      <td>200</td>\n",
       "      <td>1</td>\n",
       "      <td>83.0</td>\n",
       "      <td>54.1</td>\n",
       "      <td>2090.0</td>\n",
       "      <td>65.0</td>\n",
       "      <td>85.0</td>\n",
       "      <td>21732.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>...</td>\n",
       "      <td>20148.34</td>\n",
       "      <td>1.150000e+10</td>\n",
       "      <td>114.52</td>\n",
       "      <td>8.310000e+08</td>\n",
       "      <td>82.40</td>\n",
       "      <td>87.9</td>\n",
       "      <td>82.40</td>\n",
       "      <td>5759120.5</td>\n",
       "      <td>4.37</td>\n",
       "      <td>27.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>200</th>\n",
       "      <td>Zambia</td>\n",
       "      <td>201</td>\n",
       "      <td>3</td>\n",
       "      <td>161.0</td>\n",
       "      <td>68.0</td>\n",
       "      <td>1140.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>90.0</td>\n",
       "      <td>11696.0</td>\n",
       "      <td>1.9</td>\n",
       "      <td>...</td>\n",
       "      <td>2366.94</td>\n",
       "      <td>4.090000e+09</td>\n",
       "      <td>10.41</td>\n",
       "      <td>-4.470000e+08</td>\n",
       "      <td>175.30</td>\n",
       "      <td>163.8</td>\n",
       "      <td>175.30</td>\n",
       "      <td>4017411.0</td>\n",
       "      <td>1.95</td>\n",
       "      <td>35.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>201</th>\n",
       "      <td>Zimbabwe</td>\n",
       "      <td>202</td>\n",
       "      <td>3</td>\n",
       "      <td>101.0</td>\n",
       "      <td>89.5</td>\n",
       "      <td>NaN</td>\n",
       "      <td>88.0</td>\n",
       "      <td>87.0</td>\n",
       "      <td>13228.0</td>\n",
       "      <td>0.8</td>\n",
       "      <td>...</td>\n",
       "      <td>11457.33</td>\n",
       "      <td>5.620000e+09</td>\n",
       "      <td>3.39</td>\n",
       "      <td>-1.710000e+08</td>\n",
       "      <td>106.50</td>\n",
       "      <td>67.0</td>\n",
       "      <td>106.50</td>\n",
       "      <td>4709965.0</td>\n",
       "      <td>1.90</td>\n",
       "      <td>35.9</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>202 rows × 358 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                Country  CountryID  Continent  Adolescent fertility rate (%)  \\\n",
       "0           Afghanistan          1          1                          151.0   \n",
       "1               Albania          2          2                           27.0   \n",
       "2               Algeria          3          3                            6.0   \n",
       "3               Andorra          4          2                            NaN   \n",
       "4                Angola          5          3                          146.0   \n",
       "..                  ...        ...        ...                            ...   \n",
       "197             Vietnam        198          6                           25.0   \n",
       "198  West Bank and Gaza        199          1                            NaN   \n",
       "199               Yemen        200          1                           83.0   \n",
       "200              Zambia        201          3                          161.0   \n",
       "201            Zimbabwe        202          3                          101.0   \n",
       "\n",
       "     Adult literacy rate (%)  \\\n",
       "0                       28.0   \n",
       "1                       98.7   \n",
       "2                       69.9   \n",
       "3                        NaN   \n",
       "4                       67.4   \n",
       "..                       ...   \n",
       "197                     90.3   \n",
       "198                      NaN   \n",
       "199                     54.1   \n",
       "200                     68.0   \n",
       "201                     89.5   \n",
       "\n",
       "     Gross national income per capita (PPP international $)  \\\n",
       "0                                                  NaN        \n",
       "1                                               6000.0        \n",
       "2                                               5940.0        \n",
       "3                                                  NaN        \n",
       "4                                               3890.0        \n",
       "..                                                 ...        \n",
       "197                                             2310.0        \n",
       "198                                                NaN        \n",
       "199                                             2090.0        \n",
       "200                                             1140.0        \n",
       "201                                                NaN        \n",
       "\n",
       "     Net primary school enrolment ratio female (%)  \\\n",
       "0                                              NaN   \n",
       "1                                             93.0   \n",
       "2                                             94.0   \n",
       "3                                             83.0   \n",
       "4                                             49.0   \n",
       "..                                             ...   \n",
       "197                                           91.0   \n",
       "198                                            NaN   \n",
       "199                                           65.0   \n",
       "200                                           94.0   \n",
       "201                                           88.0   \n",
       "\n",
       "     Net primary school enrolment ratio male (%)  \\\n",
       "0                                            NaN   \n",
       "1                                           94.0   \n",
       "2                                           96.0   \n",
       "3                                           83.0   \n",
       "4                                           51.0   \n",
       "..                                           ...   \n",
       "197                                         96.0   \n",
       "198                                          NaN   \n",
       "199                                         85.0   \n",
       "200                                         90.0   \n",
       "201                                         87.0   \n",
       "\n",
       "     Population (in thousands) total  Population annual growth rate (%)  ...  \\\n",
       "0                            26088.0                                4.0  ...   \n",
       "1                             3172.0                                0.6  ...   \n",
       "2                            33351.0                                1.5  ...   \n",
       "3                               74.0                                1.0  ...   \n",
       "4                            16557.0                                2.8  ...   \n",
       "..                               ...                                ...  ...   \n",
       "197                          86206.0                                1.4  ...   \n",
       "198                              NaN                                NaN  ...   \n",
       "199                          21732.0                                3.0  ...   \n",
       "200                          11696.0                                1.9  ...   \n",
       "201                          13228.0                                0.8  ...   \n",
       "\n",
       "     Total_CO2_emissions  Total_income  Total_reserves  \\\n",
       "0                 692.50           NaN             NaN   \n",
       "1                3499.12  4.790000e+09           78.14   \n",
       "2              137535.56  6.970000e+10          351.36   \n",
       "3                    NaN           NaN             NaN   \n",
       "4                8991.46  1.490000e+10           27.13   \n",
       "..                   ...           ...             ...   \n",
       "197            101826.23  4.480000e+10           47.11   \n",
       "198               655.86  3.780000e+09             NaN   \n",
       "199             20148.34  1.150000e+10          114.52   \n",
       "200              2366.94  4.090000e+09           10.41   \n",
       "201             11457.33  5.620000e+09            3.39   \n",
       "\n",
       "     Trade_balance_goods_and_services  Under_five_mortality_from_CME  \\\n",
       "0                                 NaN                         257.00   \n",
       "1                       -2.040000e+09                          18.47   \n",
       "2                        4.700000e+09                          40.00   \n",
       "3                                 NaN                            NaN   \n",
       "4                        9.140000e+09                         164.10   \n",
       "..                                ...                            ...   \n",
       "197                     -1.940000e+09                          20.20   \n",
       "198                               NaN                          28.00   \n",
       "199                      8.310000e+08                          82.40   \n",
       "200                     -4.470000e+08                         175.30   \n",
       "201                     -1.710000e+08                         106.50   \n",
       "\n",
       "     Under_five_mortality_from_IHME  Under_five_mortality_rate  \\\n",
       "0                             231.9                     257.00   \n",
       "1                              15.5                      18.47   \n",
       "2                              31.2                      40.00   \n",
       "3                               NaN                        NaN   \n",
       "4                             242.5                     164.10   \n",
       "..                              ...                        ...   \n",
       "197                            23.4                      20.20   \n",
       "198                            25.8                      28.00   \n",
       "199                            87.9                      82.40   \n",
       "200                           163.8                     175.30   \n",
       "201                            67.0                     106.50   \n",
       "\n",
       "     Urban_population  Urban_population_growth  Urban_population_pct_of_total  \n",
       "0           5740436.0                     5.44                           22.9  \n",
       "1           1431793.9                     2.21                           45.4  \n",
       "2          20800000.0                     2.61                           63.3  \n",
       "3                 NaN                      NaN                            NaN  \n",
       "4           8578749.0                     4.14                           53.3  \n",
       "..                ...                      ...                            ...  \n",
       "197        21900000.0                     2.90                           26.4  \n",
       "198         2596216.0                     3.33                           71.6  \n",
       "199         5759120.5                     4.37                           27.3  \n",
       "200         4017411.0                     1.95                           35.0  \n",
       "201         4709965.0                     1.90                           35.9  \n",
       "\n",
       "[202 rows x 358 columns]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "df.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(202, 358)"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['Country', 'CountryID', 'Continent', 'Adolescent fertility rate (%)',\n",
       "       'Adult literacy rate (%)',\n",
       "       'Gross national income per capita (PPP international $)',\n",
       "       'Net primary school enrolment ratio female (%)',\n",
       "       'Net primary school enrolment ratio male (%)',\n",
       "       'Population (in thousands) total', 'Population annual growth rate (%)',\n",
       "       ...\n",
       "       'Total_CO2_emissions', 'Total_income', 'Total_reserves',\n",
       "       'Trade_balance_goods_and_services', 'Under_five_mortality_from_CME',\n",
       "       'Under_five_mortality_from_IHME', 'Under_five_mortality_rate',\n",
       "       'Urban_population', 'Urban_population_growth',\n",
       "       'Urban_population_pct_of_total'],\n",
       "      dtype='object', length=358)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "RangeIndex(start=0, stop=202, step=1)"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.index"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "### Atributos de un DataFrame\n",
    "\n",
    "Un dataframe dispone de diferentes atributos con los que podemos obtener su información o metainformación. Los siguientes ejemplos muestran cómo se pueden consultar sus dimensiones o un listado del nombre de sus columnas:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(202, 358)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.shape # Ver las dimensiones **"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['Country', 'CountryID', 'Continent', 'Adolescent fertility rate (%)',\n",
       "       'Adult literacy rate (%)'],\n",
       "      dtype='object')"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.columns[:5]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Podemos aplicar sobre el listado de columnas todas las operaciones sobre listas que hemos visto en la introducción del curso. A continuación tenemos dos ejemplos de indexación:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Urban_population_growth'"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.columns[-2]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['Country', 'CountryID', 'Continent', 'Adolescent fertility rate (%)',\n",
       "       'Adult literacy rate (%)'],\n",
       "      dtype='object')"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.columns[0:5]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['Under_five_mortality_from_IHME', 'Under_five_mortality_rate',\n",
       "       'Urban_population', 'Urban_population_growth'],\n",
       "      dtype='object')"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.columns[-5:-1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "**¿Cómo consultariais el nombre de la columna 10? ¿y los de las columnas de la 200 a la 225?**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "358"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(df.columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['Urban_population', 'Urban_population_growth',\n",
       "       'Urban_population_pct_of_total'],\n",
       "      dtype='object')"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.columns[200:226]\n",
    "\n",
    "df.columns[len(df.columns)-3:len(df.columns)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "### Funciones descriptivas de un dataframe\n",
    "\n",
    "`Pandas` ofrece una colección de funciones que permiten realizar una inspección general de la tabla de datos:\n",
    "\n",
    "- **describe**: muestra estadísticas descriptivas básicas para todas las columnas numéricas.\n",
    "- **info**: muestra todas las columnas y sus tipos de datos.\n",
    "- **head** i **tail**: muestra las $n$ primeras/últimas filas. El valor de $n$ es un parámetro de este método.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>Population in urban areas (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>202.000000</td>\n",
       "      <td>202.000000</td>\n",
       "      <td>177.000000</td>\n",
       "      <td>131.000000</td>\n",
       "      <td>178.000000</td>\n",
       "      <td>179.000000</td>\n",
       "      <td>179.000000</td>\n",
       "      <td>1.930000e+02</td>\n",
       "      <td>193.000000</td>\n",
       "      <td>193.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>1.860000e+02</td>\n",
       "      <td>1.780000e+02</td>\n",
       "      <td>128.000000</td>\n",
       "      <td>1.710000e+02</td>\n",
       "      <td>181.000000</td>\n",
       "      <td>170.000000</td>\n",
       "      <td>181.000000</td>\n",
       "      <td>1.880000e+02</td>\n",
       "      <td>188.000000</td>\n",
       "      <td>188.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>101.500000</td>\n",
       "      <td>3.579208</td>\n",
       "      <td>59.457627</td>\n",
       "      <td>78.871756</td>\n",
       "      <td>11250.112360</td>\n",
       "      <td>84.033520</td>\n",
       "      <td>85.698324</td>\n",
       "      <td>3.409805e+04</td>\n",
       "      <td>1.297927</td>\n",
       "      <td>54.911917</td>\n",
       "      <td>...</td>\n",
       "      <td>1.483596e+05</td>\n",
       "      <td>2.015567e+11</td>\n",
       "      <td>57.253516</td>\n",
       "      <td>3.424012e+08</td>\n",
       "      <td>56.677624</td>\n",
       "      <td>54.356471</td>\n",
       "      <td>56.677624</td>\n",
       "      <td>1.665763e+07</td>\n",
       "      <td>2.165851</td>\n",
       "      <td>55.195213</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>58.456537</td>\n",
       "      <td>1.808263</td>\n",
       "      <td>49.105286</td>\n",
       "      <td>20.415760</td>\n",
       "      <td>12586.753417</td>\n",
       "      <td>17.788047</td>\n",
       "      <td>15.451212</td>\n",
       "      <td>1.304957e+05</td>\n",
       "      <td>1.163864</td>\n",
       "      <td>23.554182</td>\n",
       "      <td>...</td>\n",
       "      <td>6.133091e+05</td>\n",
       "      <td>9.400689e+11</td>\n",
       "      <td>138.669298</td>\n",
       "      <td>5.943043e+10</td>\n",
       "      <td>60.060929</td>\n",
       "      <td>61.160556</td>\n",
       "      <td>60.060929</td>\n",
       "      <td>5.094867e+07</td>\n",
       "      <td>1.596628</td>\n",
       "      <td>23.742122</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>23.600000</td>\n",
       "      <td>260.000000</td>\n",
       "      <td>6.000000</td>\n",
       "      <td>11.000000</td>\n",
       "      <td>2.000000e+00</td>\n",
       "      <td>-2.500000</td>\n",
       "      <td>10.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>2.565000e+01</td>\n",
       "      <td>5.190000e+07</td>\n",
       "      <td>0.990000</td>\n",
       "      <td>-7.140000e+11</td>\n",
       "      <td>2.900000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>2.900000</td>\n",
       "      <td>1.545600e+04</td>\n",
       "      <td>-1.160000</td>\n",
       "      <td>10.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>51.250000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>19.000000</td>\n",
       "      <td>68.400000</td>\n",
       "      <td>2112.500000</td>\n",
       "      <td>79.000000</td>\n",
       "      <td>79.500000</td>\n",
       "      <td>1.340000e+03</td>\n",
       "      <td>0.500000</td>\n",
       "      <td>36.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>1.672615e+03</td>\n",
       "      <td>3.317500e+09</td>\n",
       "      <td>16.292500</td>\n",
       "      <td>-1.210000e+09</td>\n",
       "      <td>12.400000</td>\n",
       "      <td>8.475000</td>\n",
       "      <td>12.400000</td>\n",
       "      <td>9.171623e+05</td>\n",
       "      <td>1.105000</td>\n",
       "      <td>35.650000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>101.500000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>46.000000</td>\n",
       "      <td>86.500000</td>\n",
       "      <td>6175.000000</td>\n",
       "      <td>90.000000</td>\n",
       "      <td>90.000000</td>\n",
       "      <td>6.762000e+03</td>\n",
       "      <td>1.300000</td>\n",
       "      <td>57.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>1.021157e+04</td>\n",
       "      <td>1.145000e+10</td>\n",
       "      <td>28.515000</td>\n",
       "      <td>-2.240000e+08</td>\n",
       "      <td>29.980000</td>\n",
       "      <td>27.600000</td>\n",
       "      <td>29.980000</td>\n",
       "      <td>3.427661e+06</td>\n",
       "      <td>1.945000</td>\n",
       "      <td>57.300000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>151.750000</td>\n",
       "      <td>5.000000</td>\n",
       "      <td>91.000000</td>\n",
       "      <td>95.300000</td>\n",
       "      <td>14502.500000</td>\n",
       "      <td>96.000000</td>\n",
       "      <td>96.000000</td>\n",
       "      <td>2.173200e+04</td>\n",
       "      <td>2.100000</td>\n",
       "      <td>73.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>6.549217e+04</td>\n",
       "      <td>8.680000e+10</td>\n",
       "      <td>55.310000</td>\n",
       "      <td>1.024000e+09</td>\n",
       "      <td>88.700000</td>\n",
       "      <td>82.900000</td>\n",
       "      <td>88.700000</td>\n",
       "      <td>9.837113e+06</td>\n",
       "      <td>3.252500</td>\n",
       "      <td>72.750000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>202.000000</td>\n",
       "      <td>7.000000</td>\n",
       "      <td>199.000000</td>\n",
       "      <td>99.800000</td>\n",
       "      <td>60870.000000</td>\n",
       "      <td>100.000000</td>\n",
       "      <td>100.000000</td>\n",
       "      <td>1.328474e+06</td>\n",
       "      <td>4.300000</td>\n",
       "      <td>100.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>5.776432e+06</td>\n",
       "      <td>1.100000e+13</td>\n",
       "      <td>1334.860000</td>\n",
       "      <td>1.390000e+11</td>\n",
       "      <td>267.000000</td>\n",
       "      <td>253.700000</td>\n",
       "      <td>267.000000</td>\n",
       "      <td>5.270000e+08</td>\n",
       "      <td>7.850000</td>\n",
       "      <td>100.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>8 rows × 357 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "        CountryID   Continent  Adolescent fertility rate (%)  \\\n",
       "count  202.000000  202.000000                     177.000000   \n",
       "mean   101.500000    3.579208                      59.457627   \n",
       "std     58.456537    1.808263                      49.105286   \n",
       "min      1.000000    1.000000                       0.000000   \n",
       "25%     51.250000    2.000000                      19.000000   \n",
       "50%    101.500000    3.000000                      46.000000   \n",
       "75%    151.750000    5.000000                      91.000000   \n",
       "max    202.000000    7.000000                     199.000000   \n",
       "\n",
       "       Adult literacy rate (%)  \\\n",
       "count               131.000000   \n",
       "mean                 78.871756   \n",
       "std                  20.415760   \n",
       "min                  23.600000   \n",
       "25%                  68.400000   \n",
       "50%                  86.500000   \n",
       "75%                  95.300000   \n",
       "max                  99.800000   \n",
       "\n",
       "       Gross national income per capita (PPP international $)  \\\n",
       "count                                         178.000000        \n",
       "mean                                        11250.112360        \n",
       "std                                         12586.753417        \n",
       "min                                           260.000000        \n",
       "25%                                          2112.500000        \n",
       "50%                                          6175.000000        \n",
       "75%                                         14502.500000        \n",
       "max                                         60870.000000        \n",
       "\n",
       "       Net primary school enrolment ratio female (%)  \\\n",
       "count                                     179.000000   \n",
       "mean                                       84.033520   \n",
       "std                                        17.788047   \n",
       "min                                         6.000000   \n",
       "25%                                        79.000000   \n",
       "50%                                        90.000000   \n",
       "75%                                        96.000000   \n",
       "max                                       100.000000   \n",
       "\n",
       "       Net primary school enrolment ratio male (%)  \\\n",
       "count                                   179.000000   \n",
       "mean                                     85.698324   \n",
       "std                                      15.451212   \n",
       "min                                      11.000000   \n",
       "25%                                      79.500000   \n",
       "50%                                      90.000000   \n",
       "75%                                      96.000000   \n",
       "max                                     100.000000   \n",
       "\n",
       "       Population (in thousands) total  Population annual growth rate (%)  \\\n",
       "count                     1.930000e+02                         193.000000   \n",
       "mean                      3.409805e+04                           1.297927   \n",
       "std                       1.304957e+05                           1.163864   \n",
       "min                       2.000000e+00                          -2.500000   \n",
       "25%                       1.340000e+03                           0.500000   \n",
       "50%                       6.762000e+03                           1.300000   \n",
       "75%                       2.173200e+04                           2.100000   \n",
       "max                       1.328474e+06                           4.300000   \n",
       "\n",
       "       Population in urban areas (%)  ...  Total_CO2_emissions  Total_income  \\\n",
       "count                     193.000000  ...         1.860000e+02  1.780000e+02   \n",
       "mean                       54.911917  ...         1.483596e+05  2.015567e+11   \n",
       "std                        23.554182  ...         6.133091e+05  9.400689e+11   \n",
       "min                        10.000000  ...         2.565000e+01  5.190000e+07   \n",
       "25%                        36.000000  ...         1.672615e+03  3.317500e+09   \n",
       "50%                        57.000000  ...         1.021157e+04  1.145000e+10   \n",
       "75%                        73.000000  ...         6.549217e+04  8.680000e+10   \n",
       "max                       100.000000  ...         5.776432e+06  1.100000e+13   \n",
       "\n",
       "       Total_reserves  Trade_balance_goods_and_services  \\\n",
       "count      128.000000                      1.710000e+02   \n",
       "mean        57.253516                      3.424012e+08   \n",
       "std        138.669298                      5.943043e+10   \n",
       "min          0.990000                     -7.140000e+11   \n",
       "25%         16.292500                     -1.210000e+09   \n",
       "50%         28.515000                     -2.240000e+08   \n",
       "75%         55.310000                      1.024000e+09   \n",
       "max       1334.860000                      1.390000e+11   \n",
       "\n",
       "       Under_five_mortality_from_CME  Under_five_mortality_from_IHME  \\\n",
       "count                     181.000000                      170.000000   \n",
       "mean                       56.677624                       54.356471   \n",
       "std                        60.060929                       61.160556   \n",
       "min                         2.900000                        3.000000   \n",
       "25%                        12.400000                        8.475000   \n",
       "50%                        29.980000                       27.600000   \n",
       "75%                        88.700000                       82.900000   \n",
       "max                       267.000000                      253.700000   \n",
       "\n",
       "       Under_five_mortality_rate  Urban_population  Urban_population_growth  \\\n",
       "count                 181.000000      1.880000e+02               188.000000   \n",
       "mean                   56.677624      1.665763e+07                 2.165851   \n",
       "std                    60.060929      5.094867e+07                 1.596628   \n",
       "min                     2.900000      1.545600e+04                -1.160000   \n",
       "25%                    12.400000      9.171623e+05                 1.105000   \n",
       "50%                    29.980000      3.427661e+06                 1.945000   \n",
       "75%                    88.700000      9.837113e+06                 3.252500   \n",
       "max                   267.000000      5.270000e+08                 7.850000   \n",
       "\n",
       "       Urban_population_pct_of_total  \n",
       "count                     188.000000  \n",
       "mean                       55.195213  \n",
       "std                        23.742122  \n",
       "min                        10.000000  \n",
       "25%                        35.650000  \n",
       "50%                        57.300000  \n",
       "75%                        72.750000  \n",
       "max                       100.000000  \n",
       "\n",
       "[8 rows x 357 columns]"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.describe()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 202 entries, 0 to 201\n",
      "Columns: 358 entries, Country to Urban_population_pct_of_total\n",
      "dtypes: float64(355), int64(2), object(1)\n",
      "memory usage: 565.1+ KB\n"
     ]
    }
   ],
   "source": [
    "df.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>151.0</td>\n",
       "      <td>28.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>26088.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>...</td>\n",
       "      <td>692.50</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>257.00</td>\n",
       "      <td>231.9</td>\n",
       "      <td>257.00</td>\n",
       "      <td>5740436.0</td>\n",
       "      <td>5.44</td>\n",
       "      <td>22.9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Albania</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>27.0</td>\n",
       "      <td>98.7</td>\n",
       "      <td>6000.0</td>\n",
       "      <td>93.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>3172.0</td>\n",
       "      <td>0.6</td>\n",
       "      <td>...</td>\n",
       "      <td>3499.12</td>\n",
       "      <td>4.790000e+09</td>\n",
       "      <td>78.14</td>\n",
       "      <td>-2.040000e+09</td>\n",
       "      <td>18.47</td>\n",
       "      <td>15.5</td>\n",
       "      <td>18.47</td>\n",
       "      <td>1431793.9</td>\n",
       "      <td>2.21</td>\n",
       "      <td>45.4</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>2 rows × 358 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       Country  CountryID  Continent  Adolescent fertility rate (%)  \\\n",
       "0  Afghanistan          1          1                          151.0   \n",
       "1      Albania          2          2                           27.0   \n",
       "\n",
       "   Adult literacy rate (%)  \\\n",
       "0                     28.0   \n",
       "1                     98.7   \n",
       "\n",
       "   Gross national income per capita (PPP international $)  \\\n",
       "0                                                NaN        \n",
       "1                                             6000.0        \n",
       "\n",
       "   Net primary school enrolment ratio female (%)  \\\n",
       "0                                            NaN   \n",
       "1                                           93.0   \n",
       "\n",
       "   Net primary school enrolment ratio male (%)  \\\n",
       "0                                          NaN   \n",
       "1                                         94.0   \n",
       "\n",
       "   Population (in thousands) total  Population annual growth rate (%)  ...  \\\n",
       "0                          26088.0                                4.0  ...   \n",
       "1                           3172.0                                0.6  ...   \n",
       "\n",
       "   Total_CO2_emissions  Total_income  Total_reserves  \\\n",
       "0               692.50           NaN             NaN   \n",
       "1              3499.12  4.790000e+09           78.14   \n",
       "\n",
       "   Trade_balance_goods_and_services  Under_five_mortality_from_CME  \\\n",
       "0                               NaN                         257.00   \n",
       "1                     -2.040000e+09                          18.47   \n",
       "\n",
       "   Under_five_mortality_from_IHME  Under_five_mortality_rate  \\\n",
       "0                           231.9                     257.00   \n",
       "1                            15.5                      18.47   \n",
       "\n",
       "   Urban_population  Urban_population_growth  Urban_population_pct_of_total  \n",
       "0         5740436.0                     5.44                           22.9  \n",
       "1         1431793.9                     2.21                           45.4  \n",
       "\n",
       "[2 rows x 358 columns]"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head(2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>200</th>\n",
       "      <td>Zambia</td>\n",
       "      <td>201</td>\n",
       "      <td>3</td>\n",
       "      <td>161.0</td>\n",
       "      <td>68.0</td>\n",
       "      <td>1140.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>90.0</td>\n",
       "      <td>11696.0</td>\n",
       "      <td>1.9</td>\n",
       "      <td>...</td>\n",
       "      <td>2366.94</td>\n",
       "      <td>4.090000e+09</td>\n",
       "      <td>10.41</td>\n",
       "      <td>-447000000.0</td>\n",
       "      <td>175.3</td>\n",
       "      <td>163.8</td>\n",
       "      <td>175.3</td>\n",
       "      <td>4017411.0</td>\n",
       "      <td>1.95</td>\n",
       "      <td>35.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>201</th>\n",
       "      <td>Zimbabwe</td>\n",
       "      <td>202</td>\n",
       "      <td>3</td>\n",
       "      <td>101.0</td>\n",
       "      <td>89.5</td>\n",
       "      <td>NaN</td>\n",
       "      <td>88.0</td>\n",
       "      <td>87.0</td>\n",
       "      <td>13228.0</td>\n",
       "      <td>0.8</td>\n",
       "      <td>...</td>\n",
       "      <td>11457.33</td>\n",
       "      <td>5.620000e+09</td>\n",
       "      <td>3.39</td>\n",
       "      <td>-171000000.0</td>\n",
       "      <td>106.5</td>\n",
       "      <td>67.0</td>\n",
       "      <td>106.5</td>\n",
       "      <td>4709965.0</td>\n",
       "      <td>1.90</td>\n",
       "      <td>35.9</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>2 rows × 358 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "      Country  CountryID  Continent  Adolescent fertility rate (%)  \\\n",
       "200    Zambia        201          3                          161.0   \n",
       "201  Zimbabwe        202          3                          101.0   \n",
       "\n",
       "     Adult literacy rate (%)  \\\n",
       "200                     68.0   \n",
       "201                     89.5   \n",
       "\n",
       "     Gross national income per capita (PPP international $)  \\\n",
       "200                                             1140.0        \n",
       "201                                                NaN        \n",
       "\n",
       "     Net primary school enrolment ratio female (%)  \\\n",
       "200                                           94.0   \n",
       "201                                           88.0   \n",
       "\n",
       "     Net primary school enrolment ratio male (%)  \\\n",
       "200                                         90.0   \n",
       "201                                         87.0   \n",
       "\n",
       "     Population (in thousands) total  Population annual growth rate (%)  ...  \\\n",
       "200                          11696.0                                1.9  ...   \n",
       "201                          13228.0                                0.8  ...   \n",
       "\n",
       "     Total_CO2_emissions  Total_income  Total_reserves  \\\n",
       "200              2366.94  4.090000e+09           10.41   \n",
       "201             11457.33  5.620000e+09            3.39   \n",
       "\n",
       "     Trade_balance_goods_and_services  Under_five_mortality_from_CME  \\\n",
       "200                      -447000000.0                          175.3   \n",
       "201                      -171000000.0                          106.5   \n",
       "\n",
       "     Under_five_mortality_from_IHME  Under_five_mortality_rate  \\\n",
       "200                           163.8                      175.3   \n",
       "201                            67.0                      106.5   \n",
       "\n",
       "     Urban_population  Urban_population_growth  Urban_population_pct_of_total  \n",
       "200         4017411.0                     1.95                           35.0  \n",
       "201         4709965.0                     1.90                           35.9  \n",
       "\n",
       "[2 rows x 358 columns]"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.tail(2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>200</th>\n",
       "      <td>Zambia</td>\n",
       "      <td>201</td>\n",
       "      <td>3</td>\n",
       "      <td>161.0</td>\n",
       "      <td>68.0</td>\n",
       "      <td>1140.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>90.0</td>\n",
       "      <td>11696.0</td>\n",
       "      <td>1.9</td>\n",
       "      <td>...</td>\n",
       "      <td>2366.94</td>\n",
       "      <td>4.090000e+09</td>\n",
       "      <td>10.41</td>\n",
       "      <td>-447000000.0</td>\n",
       "      <td>175.3</td>\n",
       "      <td>163.8</td>\n",
       "      <td>175.3</td>\n",
       "      <td>4017411.0</td>\n",
       "      <td>1.95</td>\n",
       "      <td>35.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>201</th>\n",
       "      <td>Zimbabwe</td>\n",
       "      <td>202</td>\n",
       "      <td>3</td>\n",
       "      <td>101.0</td>\n",
       "      <td>89.5</td>\n",
       "      <td>NaN</td>\n",
       "      <td>88.0</td>\n",
       "      <td>87.0</td>\n",
       "      <td>13228.0</td>\n",
       "      <td>0.8</td>\n",
       "      <td>...</td>\n",
       "      <td>11457.33</td>\n",
       "      <td>5.620000e+09</td>\n",
       "      <td>3.39</td>\n",
       "      <td>-171000000.0</td>\n",
       "      <td>106.5</td>\n",
       "      <td>67.0</td>\n",
       "      <td>106.5</td>\n",
       "      <td>4709965.0</td>\n",
       "      <td>1.90</td>\n",
       "      <td>35.9</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>2 rows × 358 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "      Country  CountryID  Continent  Adolescent fertility rate (%)  \\\n",
       "200    Zambia        201          3                          161.0   \n",
       "201  Zimbabwe        202          3                          101.0   \n",
       "\n",
       "     Adult literacy rate (%)  \\\n",
       "200                     68.0   \n",
       "201                     89.5   \n",
       "\n",
       "     Gross national income per capita (PPP international $)  \\\n",
       "200                                             1140.0        \n",
       "201                                                NaN        \n",
       "\n",
       "     Net primary school enrolment ratio female (%)  \\\n",
       "200                                           94.0   \n",
       "201                                           88.0   \n",
       "\n",
       "     Net primary school enrolment ratio male (%)  \\\n",
       "200                                         90.0   \n",
       "201                                         87.0   \n",
       "\n",
       "     Population (in thousands) total  Population annual growth rate (%)  ...  \\\n",
       "200                          11696.0                                1.9  ...   \n",
       "201                          13228.0                                0.8  ...   \n",
       "\n",
       "     Total_CO2_emissions  Total_income  Total_reserves  \\\n",
       "200              2366.94  4.090000e+09           10.41   \n",
       "201             11457.33  5.620000e+09            3.39   \n",
       "\n",
       "     Trade_balance_goods_and_services  Under_five_mortality_from_CME  \\\n",
       "200                      -447000000.0                          175.3   \n",
       "201                      -171000000.0                          106.5   \n",
       "\n",
       "     Under_five_mortality_from_IHME  Under_five_mortality_rate  \\\n",
       "200                           163.8                      175.3   \n",
       "201                            67.0                      106.5   \n",
       "\n",
       "     Urban_population  Urban_population_growth  Urban_population_pct_of_total  \n",
       "200         4017411.0                     1.95                           35.0  \n",
       "201         4709965.0                     1.90                           35.9  \n",
       "\n",
       "[2 rows x 358 columns]"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.tail(2)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Carga de datos (segunda parte)\n",
    "\n",
    "Desafortunadamente, la estructura y codificación de los datos en los archivos CSV varía según la herramienta o el sistema operativo. Por lo tanto, podemos encontrarnos con separadores entre columnas que no sean la típica coma (',') o formatos de codificación de texto que no sean abiertos (por ejemplo, utf-8, ansi, etc.).\n",
    "\n",
    "Es por esto que la función `read_csv` es muy versátil. Puedes consultar su [documentación](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html).\n",
    "\n",
    "Vamos a ver qué sucede cuando se obtienen datos de una administración pública. Puedes encontrar el archivo disponible en: 'data/presupuesto_gastos_2023.csv'. Puedes acceder a estos datos a través del siguiente [enlace](https://datos.gob.es/es/catalogo?q=bilbao&g-recaptcha-response=&administration_level=L&theme_id=economia&sort=score+desc%2C+metadata_created+desc)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_gastos = pd.read_csv(\"data/presupuesto_gastos_2023.csv\",encoding=\"cp1250\",sep=\";\") # quins errors genera ?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>_id</th>\n",
       "      <th>SOZIETATEA_EU/SOCIEDAD_EU</th>\n",
       "      <th>SOZIETATEA_CAS/SOCIEDAD_CAS</th>\n",
       "      <th>EKITALDIA/EJERCICIO</th>\n",
       "      <th>SAILA/DEPARTAMENTO</th>\n",
       "      <th>SAILAREN DESKRIBAPENA_EU/DESCRIPCION DEPARTAMENTO_EU</th>\n",
       "      <th>SAILAREN DESKRIBAPENA_EU/DESCRIPCION DEPARTAMENTO_CAS</th>\n",
       "      <th>ZENTRU KUDEATZAILEA/CENTRO GESTOR</th>\n",
       "      <th>ZENTRO KUDEATZAILEAREN DESKR._EUS/DESCR. CENTRO GESTOR_EUS</th>\n",
       "      <th>ZENTRO KUDEATZAILEAREN DESKR._CAS/DESCR. CENTRO GESTOR_CAS</th>\n",
       "      <th>...</th>\n",
       "      <th>ARTIKULUAREN DESKRIBAPENA_CAS/DESCRIPCION ARTICULO_CAS</th>\n",
       "      <th>KONTZEPTUA/CONCEPTO</th>\n",
       "      <th>KONTZEPTUAREN DESKRIBAPENA_EUS/DESCRIPCION CONCEPTO_EUS</th>\n",
       "      <th>KONTZEPTUAREN DESKRIBAPENA_CAS/DESCRIPCION CONCEPTO_CAS</th>\n",
       "      <th>AZPIKONTZEPTUA/SUBCONCEPTO</th>\n",
       "      <th>AZPIKONTZEPTUAREN DESKRIBAPENA_EUS/DESCRIPCION SUBCONCEPTO_EUS</th>\n",
       "      <th>AZPIKONTZEPTUAREN DESKRIBAPENA_CAS/DESCRIPCION SUBCONCEPTO_CAS</th>\n",
       "      <th>PROIEKTUA/PROYECTO</th>\n",
       "      <th>PROIEKTUAREN DESKRIBAPENA/DESCRIPCION PROYECTO</th>\n",
       "      <th>HASIERAKO KREDITUA/CREDITO INICIAL 2023</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>2023</td>\n",
       "      <td>110</td>\n",
       "      <td>KULTURA ETA GOBERNANTZA</td>\n",
       "      <td>CULTURA Y GOBERNANZA</td>\n",
       "      <td>1100</td>\n",
       "      <td>KULTURA ETA GOBERNANTZA</td>\n",
       "      <td>CULTURA Y GOBERNANZA</td>\n",
       "      <td>...</td>\n",
       "      <td>RETRIBUCIONES DE ALTOS CARGOS</td>\n",
       "      <td>100</td>\n",
       "      <td>GOI KARGUEN OINARRIZKO SOLDATAK ETA BESTELAKO ...</td>\n",
       "      <td>RETRIBUCIONES BASICAS Y OTRAS REMUNERACIONES D...</td>\n",
       "      <td>10001</td>\n",
       "      <td>GOI KARGUEN OINARRIZKO SOLDATAK ETA BESTELAKO ...</td>\n",
       "      <td>RETRIBUCIONES BASICAS Y OTRAS REMUNERACIONES D...</td>\n",
       "      <td>9999/99999</td>\n",
       "      <td>GENERIKOA/ GENÉRICO</td>\n",
       "      <td>220.619,00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>2023</td>\n",
       "      <td>110</td>\n",
       "      <td>KULTURA ETA GOBERNANTZA</td>\n",
       "      <td>CULTURA Y GOBERNANZA</td>\n",
       "      <td>1100</td>\n",
       "      <td>KULTURA ETA GOBERNANTZA</td>\n",
       "      <td>CULTURA Y GOBERNANZA</td>\n",
       "      <td>...</td>\n",
       "      <td>RETRIBUCIONES DEL PERSONAL EVENTUAL DE GABINETES</td>\n",
       "      <td>110</td>\n",
       "      <td>KABINETEETAKO ALDI BAT. PERTSONALAREN OINARRIZ...</td>\n",
       "      <td>RETRIBUCIONES BASICAS Y OTRAS REMUNERACIONES D...</td>\n",
       "      <td>11001</td>\n",
       "      <td>KABINETEETAKO ALDI BAT. PERTSONALAREN OINARRIZ...</td>\n",
       "      <td>RETRIBUCIONES BASICAS Y OTRAS REMUNERACIONES D...</td>\n",
       "      <td>9999/99999</td>\n",
       "      <td>GENERIKOA/ GENÉRICO</td>\n",
       "      <td>589.261,00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>2023</td>\n",
       "      <td>110</td>\n",
       "      <td>KULTURA ETA GOBERNANTZA</td>\n",
       "      <td>CULTURA Y GOBERNANZA</td>\n",
       "      <td>1100</td>\n",
       "      <td>KULTURA ETA GOBERNANTZA</td>\n",
       "      <td>CULTURA Y GOBERNANZA</td>\n",
       "      <td>...</td>\n",
       "      <td>RETRIBUCIONES DEL PERSONAL FUNCIONARIO</td>\n",
       "      <td>120</td>\n",
       "      <td>FUNTZIONARIOEN OINARRIZKO SOLDATAK</td>\n",
       "      <td>RETRIBUCIONES BASICAS DEL PERSONAL FUNCIONARIO</td>\n",
       "      <td>12001</td>\n",
       "      <td>FUNTZIONARIOEN OINARRIZKO SOLDATAK</td>\n",
       "      <td>RETRIBUCIONES BASICAS DEL PERSONAL FUNCIONARIO</td>\n",
       "      <td>9999/99999</td>\n",
       "      <td>GENERIKOA/ GENÉRICO</td>\n",
       "      <td>383.369,00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>2023</td>\n",
       "      <td>110</td>\n",
       "      <td>KULTURA ETA GOBERNANTZA</td>\n",
       "      <td>CULTURA Y GOBERNANZA</td>\n",
       "      <td>1100</td>\n",
       "      <td>KULTURA ETA GOBERNANTZA</td>\n",
       "      <td>CULTURA Y GOBERNANZA</td>\n",
       "      <td>...</td>\n",
       "      <td>RETRIBUCIONES DEL PERSONAL FUNCIONARIO</td>\n",
       "      <td>121</td>\n",
       "      <td>FUNTZIONARIOEN ORDAINSARI OSAGARRIAK</td>\n",
       "      <td>RETRIBUCIONES COMPLEMENTARIAS DEL PERSONAL FUN...</td>\n",
       "      <td>12101</td>\n",
       "      <td>LANTOKI OSAGARRIA</td>\n",
       "      <td>COMPLEMENTO DE DESTINO</td>\n",
       "      <td>9999/99999</td>\n",
       "      <td>GENERIKOA/ GENÉRICO</td>\n",
       "      <td>131.722,00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>2023</td>\n",
       "      <td>110</td>\n",
       "      <td>KULTURA ETA GOBERNANTZA</td>\n",
       "      <td>CULTURA Y GOBERNANZA</td>\n",
       "      <td>1100</td>\n",
       "      <td>KULTURA ETA GOBERNANTZA</td>\n",
       "      <td>CULTURA Y GOBERNANZA</td>\n",
       "      <td>...</td>\n",
       "      <td>RETRIBUCIONES DEL PERSONAL FUNCIONARIO</td>\n",
       "      <td>121</td>\n",
       "      <td>FUNTZIONARIOEN ORDAINSARI OSAGARRIAK</td>\n",
       "      <td>RETRIBUCIONES COMPLEMENTARIAS DEL PERSONAL FUN...</td>\n",
       "      <td>12102</td>\n",
       "      <td>OSAGARRI BEREZIA</td>\n",
       "      <td>COMPLEMENTO ESPECIFICO</td>\n",
       "      <td>9999/99999</td>\n",
       "      <td>GENERIKOA/ GENÉRICO</td>\n",
       "      <td>396.609,00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2088</th>\n",
       "      <td>2089</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>2023</td>\n",
       "      <td>A10</td>\n",
       "      <td>ALKATETZAKO KABINETEA</td>\n",
       "      <td>GABINETE DE ALCALDÍA</td>\n",
       "      <td>A101</td>\n",
       "      <td>PROTOKOLOKO ETA HARREMAN PUBLIKOEN BULEGOA</td>\n",
       "      <td>OFICINA PROTOCOLO, RR.PP. E NSTITUCIONALES</td>\n",
       "      <td>...</td>\n",
       "      <td>MATERIAL, SUMINISTROS Y OTROS</td>\n",
       "      <td>227</td>\n",
       "      <td>KANPOKO ENPRESEK EGINDAKO LANAK</td>\n",
       "      <td>TRABAJOS REALIZADOS POR OTRAS EMPRESAS EXTERNAS</td>\n",
       "      <td>22717</td>\n",
       "      <td>BIDAI AGENTZIETAKO ZERBITZUAK</td>\n",
       "      <td>SERVICIOS DE AGENCIAS DE VIAJES</td>\n",
       "      <td>2006/00184</td>\n",
       "      <td>Harreman Publikoetako Kabineteari lotutako gas...</td>\n",
       "      <td>500,00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2089</th>\n",
       "      <td>2090</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>2023</td>\n",
       "      <td>A10</td>\n",
       "      <td>ALKATETZAKO KABINETEA</td>\n",
       "      <td>GABINETE DE ALCALDÍA</td>\n",
       "      <td>A101</td>\n",
       "      <td>PROTOKOLOKO ETA HARREMAN PUBLIKOEN BULEGOA</td>\n",
       "      <td>OFICINA PROTOCOLO, RR.PP. E NSTITUCIONALES</td>\n",
       "      <td>...</td>\n",
       "      <td>MATERIAL, SUMINISTROS Y OTROS</td>\n",
       "      <td>227</td>\n",
       "      <td>KANPOKO ENPRESEK EGINDAKO LANAK</td>\n",
       "      <td>TRABAJOS REALIZADOS POR OTRAS EMPRESAS EXTERNAS</td>\n",
       "      <td>22720</td>\n",
       "      <td>ITZULPEN ETA INTERPRETARITZA ZERBITZUAK</td>\n",
       "      <td>SERVICIOS DE TRADUCCION E INTERPRETES</td>\n",
       "      <td>2006/00182</td>\n",
       "      <td>Protokoloko eta ordezkaritzako arreta/Harreman...</td>\n",
       "      <td>2.000,00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2090</th>\n",
       "      <td>2091</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>2023</td>\n",
       "      <td>A10</td>\n",
       "      <td>ALKATETZAKO KABINETEA</td>\n",
       "      <td>GABINETE DE ALCALDÍA</td>\n",
       "      <td>A101</td>\n",
       "      <td>PROTOKOLOKO ETA HARREMAN PUBLIKOEN BULEGOA</td>\n",
       "      <td>OFICINA PROTOCOLO, RR.PP. E NSTITUCIONALES</td>\n",
       "      <td>...</td>\n",
       "      <td>MATERIAL, SUMINISTROS Y OTROS</td>\n",
       "      <td>227</td>\n",
       "      <td>KANPOKO ENPRESEK EGINDAKO LANAK</td>\n",
       "      <td>TRABAJOS REALIZADOS POR OTRAS EMPRESAS EXTERNAS</td>\n",
       "      <td>22723</td>\n",
       "      <td>ZERBITZUAK:ARGIAK ETA SOINUAK</td>\n",
       "      <td>SERVICIOS DE ILUMINACIÓN Y SONIDO</td>\n",
       "      <td>2006/00182</td>\n",
       "      <td>Protokoloko eta ordezkaritzako arreta/Harreman...</td>\n",
       "      <td>10.000,00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2091</th>\n",
       "      <td>2092</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>2023</td>\n",
       "      <td>A10</td>\n",
       "      <td>ALKATETZAKO KABINETEA</td>\n",
       "      <td>GABINETE DE ALCALDÍA</td>\n",
       "      <td>A101</td>\n",
       "      <td>PROTOKOLOKO ETA HARREMAN PUBLIKOEN BULEGOA</td>\n",
       "      <td>OFICINA PROTOCOLO, RR.PP. E NSTITUCIONALES</td>\n",
       "      <td>...</td>\n",
       "      <td>MATERIAL, SUMINISTROS Y OTROS</td>\n",
       "      <td>227</td>\n",
       "      <td>KANPOKO ENPRESEK EGINDAKO LANAK</td>\n",
       "      <td>TRABAJOS REALIZADOS POR OTRAS EMPRESAS EXTERNAS</td>\n",
       "      <td>22799</td>\n",
       "      <td>KANPOKO ENPRESEK EGINDAKO BESTELAKO LANAK</td>\n",
       "      <td>OTROS TRABAJOS REALIZADOS POR EMPRESAS EXTERNAS</td>\n",
       "      <td>2006/00182</td>\n",
       "      <td>Protokoloko eta ordezkaritzako arreta/Harreman...</td>\n",
       "      <td>48.400,00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2092</th>\n",
       "      <td>2093</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>UDAL</td>\n",
       "      <td>2023</td>\n",
       "      <td>A10</td>\n",
       "      <td>ALKATETZAKO KABINETEA</td>\n",
       "      <td>GABINETE DE ALCALDÍA</td>\n",
       "      <td>A101</td>\n",
       "      <td>PROTOKOLOKO ETA HARREMAN PUBLIKOEN BULEGOA</td>\n",
       "      <td>OFICINA PROTOCOLO, RR.PP. E NSTITUCIONALES</td>\n",
       "      <td>...</td>\n",
       "      <td>INDEMNIZACIONES POR RAZON DEL SERVICIO</td>\n",
       "      <td>230</td>\n",
       "      <td>BIDAI SARIAK, PERTSONALAREN GARRAIOA ETA LOKOM...</td>\n",
       "      <td>DIETAS, LOCOMOCION Y TRASLADO DEL PERSONAL</td>\n",
       "      <td>23001</td>\n",
       "      <td>BIDAI SARIAK, PERTSONALAREN GARRAIOA ETA LOKOM...</td>\n",
       "      <td>DIETAS, LOCOMOCION Y TRASLADO DEL PERSONAL</td>\n",
       "      <td>2006/00184</td>\n",
       "      <td>Harreman Publikoetako Kabineteari lotutako gas...</td>\n",
       "      <td>500,00</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>2093 rows × 37 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       _id SOZIETATEA_EU/SOCIEDAD_EU SOZIETATEA_CAS/SOCIEDAD_CAS  \\\n",
       "0        1                      UDAL                        UDAL   \n",
       "1        2                      UDAL                        UDAL   \n",
       "2        3                      UDAL                        UDAL   \n",
       "3        4                      UDAL                        UDAL   \n",
       "4        5                      UDAL                        UDAL   \n",
       "...    ...                       ...                         ...   \n",
       "2088  2089                      UDAL                        UDAL   \n",
       "2089  2090                      UDAL                        UDAL   \n",
       "2090  2091                      UDAL                        UDAL   \n",
       "2091  2092                      UDAL                        UDAL   \n",
       "2092  2093                      UDAL                        UDAL   \n",
       "\n",
       "      EKITALDIA/EJERCICIO SAILA/DEPARTAMENTO  \\\n",
       "0                    2023                110   \n",
       "1                    2023                110   \n",
       "2                    2023                110   \n",
       "3                    2023                110   \n",
       "4                    2023                110   \n",
       "...                   ...                ...   \n",
       "2088                 2023                A10   \n",
       "2089                 2023                A10   \n",
       "2090                 2023                A10   \n",
       "2091                 2023                A10   \n",
       "2092                 2023                A10   \n",
       "\n",
       "     SAILAREN DESKRIBAPENA_EU/DESCRIPCION DEPARTAMENTO_EU  \\\n",
       "0                               KULTURA ETA GOBERNANTZA     \n",
       "1                               KULTURA ETA GOBERNANTZA     \n",
       "2                               KULTURA ETA GOBERNANTZA     \n",
       "3                               KULTURA ETA GOBERNANTZA     \n",
       "4                               KULTURA ETA GOBERNANTZA     \n",
       "...                                                 ...     \n",
       "2088                              ALKATETZAKO KABINETEA     \n",
       "2089                              ALKATETZAKO KABINETEA     \n",
       "2090                              ALKATETZAKO KABINETEA     \n",
       "2091                              ALKATETZAKO KABINETEA     \n",
       "2092                              ALKATETZAKO KABINETEA     \n",
       "\n",
       "     SAILAREN DESKRIBAPENA_EU/DESCRIPCION DEPARTAMENTO_CAS  \\\n",
       "0                                  CULTURA Y GOBERNANZA      \n",
       "1                                  CULTURA Y GOBERNANZA      \n",
       "2                                  CULTURA Y GOBERNANZA      \n",
       "3                                  CULTURA Y GOBERNANZA      \n",
       "4                                  CULTURA Y GOBERNANZA      \n",
       "...                                                 ...      \n",
       "2088                               GABINETE DE ALCALDÍA      \n",
       "2089                               GABINETE DE ALCALDÍA      \n",
       "2090                               GABINETE DE ALCALDÍA      \n",
       "2091                               GABINETE DE ALCALDÍA      \n",
       "2092                               GABINETE DE ALCALDÍA      \n",
       "\n",
       "     ZENTRU KUDEATZAILEA/CENTRO GESTOR  \\\n",
       "0                                 1100   \n",
       "1                                 1100   \n",
       "2                                 1100   \n",
       "3                                 1100   \n",
       "4                                 1100   \n",
       "...                                ...   \n",
       "2088                              A101   \n",
       "2089                              A101   \n",
       "2090                              A101   \n",
       "2091                              A101   \n",
       "2092                              A101   \n",
       "\n",
       "     ZENTRO KUDEATZAILEAREN DESKR._EUS/DESCR. CENTRO GESTOR_EUS  \\\n",
       "0                               KULTURA ETA GOBERNANTZA           \n",
       "1                               KULTURA ETA GOBERNANTZA           \n",
       "2                               KULTURA ETA GOBERNANTZA           \n",
       "3                               KULTURA ETA GOBERNANTZA           \n",
       "4                               KULTURA ETA GOBERNANTZA           \n",
       "...                                                 ...           \n",
       "2088         PROTOKOLOKO ETA HARREMAN PUBLIKOEN BULEGOA           \n",
       "2089         PROTOKOLOKO ETA HARREMAN PUBLIKOEN BULEGOA           \n",
       "2090         PROTOKOLOKO ETA HARREMAN PUBLIKOEN BULEGOA           \n",
       "2091         PROTOKOLOKO ETA HARREMAN PUBLIKOEN BULEGOA           \n",
       "2092         PROTOKOLOKO ETA HARREMAN PUBLIKOEN BULEGOA           \n",
       "\n",
       "     ZENTRO KUDEATZAILEAREN DESKR._CAS/DESCR. CENTRO GESTOR_CAS  ...  \\\n",
       "0                                  CULTURA Y GOBERNANZA          ...   \n",
       "1                                  CULTURA Y GOBERNANZA          ...   \n",
       "2                                  CULTURA Y GOBERNANZA          ...   \n",
       "3                                  CULTURA Y GOBERNANZA          ...   \n",
       "4                                  CULTURA Y GOBERNANZA          ...   \n",
       "...                                                 ...          ...   \n",
       "2088         OFICINA PROTOCOLO, RR.PP. E NSTITUCIONALES          ...   \n",
       "2089         OFICINA PROTOCOLO, RR.PP. E NSTITUCIONALES          ...   \n",
       "2090         OFICINA PROTOCOLO, RR.PP. E NSTITUCIONALES          ...   \n",
       "2091         OFICINA PROTOCOLO, RR.PP. E NSTITUCIONALES          ...   \n",
       "2092         OFICINA PROTOCOLO, RR.PP. E NSTITUCIONALES          ...   \n",
       "\n",
       "      ARTIKULUAREN DESKRIBAPENA_CAS/DESCRIPCION ARTICULO_CAS  \\\n",
       "0                         RETRIBUCIONES DE ALTOS CARGOS        \n",
       "1      RETRIBUCIONES DEL PERSONAL EVENTUAL DE GABINETES        \n",
       "2                RETRIBUCIONES DEL PERSONAL FUNCIONARIO        \n",
       "3                RETRIBUCIONES DEL PERSONAL FUNCIONARIO        \n",
       "4                RETRIBUCIONES DEL PERSONAL FUNCIONARIO        \n",
       "...                                                 ...        \n",
       "2088                      MATERIAL, SUMINISTROS Y OTROS        \n",
       "2089                      MATERIAL, SUMINISTROS Y OTROS        \n",
       "2090                      MATERIAL, SUMINISTROS Y OTROS        \n",
       "2091                      MATERIAL, SUMINISTROS Y OTROS        \n",
       "2092             INDEMNIZACIONES POR RAZON DEL SERVICIO        \n",
       "\n",
       "     KONTZEPTUA/CONCEPTO  \\\n",
       "0                    100   \n",
       "1                    110   \n",
       "2                    120   \n",
       "3                    121   \n",
       "4                    121   \n",
       "...                  ...   \n",
       "2088                 227   \n",
       "2089                 227   \n",
       "2090                 227   \n",
       "2091                 227   \n",
       "2092                 230   \n",
       "\n",
       "     KONTZEPTUAREN DESKRIBAPENA_EUS/DESCRIPCION CONCEPTO_EUS  \\\n",
       "0     GOI KARGUEN OINARRIZKO SOLDATAK ETA BESTELAKO ...        \n",
       "1     KABINETEETAKO ALDI BAT. PERTSONALAREN OINARRIZ...        \n",
       "2                    FUNTZIONARIOEN OINARRIZKO SOLDATAK        \n",
       "3                  FUNTZIONARIOEN ORDAINSARI OSAGARRIAK        \n",
       "4                  FUNTZIONARIOEN ORDAINSARI OSAGARRIAK        \n",
       "...                                                 ...        \n",
       "2088                    KANPOKO ENPRESEK EGINDAKO LANAK        \n",
       "2089                    KANPOKO ENPRESEK EGINDAKO LANAK        \n",
       "2090                    KANPOKO ENPRESEK EGINDAKO LANAK        \n",
       "2091                    KANPOKO ENPRESEK EGINDAKO LANAK        \n",
       "2092  BIDAI SARIAK, PERTSONALAREN GARRAIOA ETA LOKOM...        \n",
       "\n",
       "      KONTZEPTUAREN DESKRIBAPENA_CAS/DESCRIPCION CONCEPTO_CAS  \\\n",
       "0     RETRIBUCIONES BASICAS Y OTRAS REMUNERACIONES D...         \n",
       "1     RETRIBUCIONES BASICAS Y OTRAS REMUNERACIONES D...         \n",
       "2        RETRIBUCIONES BASICAS DEL PERSONAL FUNCIONARIO         \n",
       "3     RETRIBUCIONES COMPLEMENTARIAS DEL PERSONAL FUN...         \n",
       "4     RETRIBUCIONES COMPLEMENTARIAS DEL PERSONAL FUN...         \n",
       "...                                                 ...         \n",
       "2088    TRABAJOS REALIZADOS POR OTRAS EMPRESAS EXTERNAS         \n",
       "2089    TRABAJOS REALIZADOS POR OTRAS EMPRESAS EXTERNAS         \n",
       "2090    TRABAJOS REALIZADOS POR OTRAS EMPRESAS EXTERNAS         \n",
       "2091    TRABAJOS REALIZADOS POR OTRAS EMPRESAS EXTERNAS         \n",
       "2092         DIETAS, LOCOMOCION Y TRASLADO DEL PERSONAL         \n",
       "\n",
       "     AZPIKONTZEPTUA/SUBCONCEPTO  \\\n",
       "0                         10001   \n",
       "1                         11001   \n",
       "2                         12001   \n",
       "3                         12101   \n",
       "4                         12102   \n",
       "...                         ...   \n",
       "2088                      22717   \n",
       "2089                      22720   \n",
       "2090                      22723   \n",
       "2091                      22799   \n",
       "2092                      23001   \n",
       "\n",
       "     AZPIKONTZEPTUAREN DESKRIBAPENA_EUS/DESCRIPCION SUBCONCEPTO_EUS  \\\n",
       "0     GOI KARGUEN OINARRIZKO SOLDATAK ETA BESTELAKO ...               \n",
       "1     KABINETEETAKO ALDI BAT. PERTSONALAREN OINARRIZ...               \n",
       "2                    FUNTZIONARIOEN OINARRIZKO SOLDATAK               \n",
       "3                                     LANTOKI OSAGARRIA               \n",
       "4                                      OSAGARRI BEREZIA               \n",
       "...                                                 ...               \n",
       "2088                      BIDAI AGENTZIETAKO ZERBITZUAK               \n",
       "2089            ITZULPEN ETA INTERPRETARITZA ZERBITZUAK               \n",
       "2090                      ZERBITZUAK:ARGIAK ETA SOINUAK               \n",
       "2091          KANPOKO ENPRESEK EGINDAKO BESTELAKO LANAK               \n",
       "2092  BIDAI SARIAK, PERTSONALAREN GARRAIOA ETA LOKOM...               \n",
       "\n",
       "      AZPIKONTZEPTUAREN DESKRIBAPENA_CAS/DESCRIPCION SUBCONCEPTO_CAS  \\\n",
       "0     RETRIBUCIONES BASICAS Y OTRAS REMUNERACIONES D...                \n",
       "1     RETRIBUCIONES BASICAS Y OTRAS REMUNERACIONES D...                \n",
       "2        RETRIBUCIONES BASICAS DEL PERSONAL FUNCIONARIO                \n",
       "3                                COMPLEMENTO DE DESTINO                \n",
       "4                                COMPLEMENTO ESPECIFICO                \n",
       "...                                                 ...                \n",
       "2088                    SERVICIOS DE AGENCIAS DE VIAJES                \n",
       "2089              SERVICIOS DE TRADUCCION E INTERPRETES                \n",
       "2090                  SERVICIOS DE ILUMINACIÓN Y SONIDO                \n",
       "2091    OTROS TRABAJOS REALIZADOS POR EMPRESAS EXTERNAS                \n",
       "2092         DIETAS, LOCOMOCION Y TRASLADO DEL PERSONAL                \n",
       "\n",
       "     PROIEKTUA/PROYECTO     PROIEKTUAREN DESKRIBAPENA/DESCRIPCION PROYECTO  \\\n",
       "0            9999/99999                                GENERIKOA/ GENÉRICO   \n",
       "1            9999/99999                                GENERIKOA/ GENÉRICO   \n",
       "2            9999/99999                                GENERIKOA/ GENÉRICO   \n",
       "3            9999/99999                                GENERIKOA/ GENÉRICO   \n",
       "4            9999/99999                                GENERIKOA/ GENÉRICO   \n",
       "...                 ...                                                ...   \n",
       "2088         2006/00184  Harreman Publikoetako Kabineteari lotutako gas...   \n",
       "2089         2006/00182  Protokoloko eta ordezkaritzako arreta/Harreman...   \n",
       "2090         2006/00182  Protokoloko eta ordezkaritzako arreta/Harreman...   \n",
       "2091         2006/00182  Protokoloko eta ordezkaritzako arreta/Harreman...   \n",
       "2092         2006/00184  Harreman Publikoetako Kabineteari lotutako gas...   \n",
       "\n",
       "      HASIERAKO KREDITUA/CREDITO INICIAL 2023  \n",
       "0                                  220.619,00  \n",
       "1                                  589.261,00  \n",
       "2                                  383.369,00  \n",
       "3                                  131.722,00  \n",
       "4                                  396.609,00  \n",
       "...                                       ...  \n",
       "2088                                   500,00  \n",
       "2089                                 2.000,00  \n",
       "2090                                10.000,00  \n",
       "2091                                48.400,00  \n",
       "2092                                   500,00  \n",
       "\n",
       "[2093 rows x 37 columns]"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_gastos"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "###  Codificación de caracteres\n",
    "\n",
    "Python utilitza una representación basada en Unicode (https://home.unicode.org/). Otros sistemas operativos y programas utilizan otro tipo de representaciones.\n",
    "\n",
    "```python\n",
    "var = \"camión\"\n",
    "var = \"lul·lià\"\n",
    "var = \"Ζεύς\"\n",
    "var = \"ประเทศไทย\"\n",
    "var = \"日本語で\"\n",
    "```\n",
    "\n",
    "\n",
    "Codificaciones:\n",
    "- [listado de codificaciones](https://docs.python.org/3.11/library/codecs.html#standard-encodings)\n",
    "- [UTF-8](https://es.wikipedia.org/wiki/UTF-8)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "ename": "UnicodeDecodeError",
     "evalue": "'utf-8' codec can't decode byte 0xc1 in position 1697: invalid start byte",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mUnicodeDecodeError\u001b[0m                        Traceback (most recent call last)",
      "\u001b[1;32m/Users/isaac/Projects/TxADM_notebooks/notebooks/Part2/00_Pandas/01_Introduccion.ipynb Cell 32\u001b[0m line \u001b[0;36m<cell line: 3>\u001b[0;34m()\u001b[0m\n\u001b[1;32m      <a href='vscode-notebook-cell:/Users/isaac/Projects/TxADM_notebooks/notebooks/Part2/00_Pandas/01_Introduccion.ipynb#X35sZmlsZQ%3D%3D?line=0'>1</a>\u001b[0m \u001b[39m# cp1250 | windows-1250 | Central and Eastern Europe\u001b[39;00m\n\u001b[0;32m----> <a href='vscode-notebook-cell:/Users/isaac/Projects/TxADM_notebooks/notebooks/Part2/00_Pandas/01_Introduccion.ipynb#X35sZmlsZQ%3D%3D?line=2'>3</a>\u001b[0m df_gastos \u001b[39m=\u001b[39m pd\u001b[39m.\u001b[39;49mread_csv(\u001b[39m\"\u001b[39;49m\u001b[39mdata/presupuesto_gastos_2023.csv\u001b[39;49m\u001b[39m\"\u001b[39;49m,sep\u001b[39m=\u001b[39;49m\u001b[39m\"\u001b[39;49m\u001b[39m;\u001b[39;49m\u001b[39m\"\u001b[39;49m)\n",
      "File \u001b[0;32m~/.pyenv/versions/3.9.7/lib/python3.9/site-packages/pandas/util/_decorators.py:211\u001b[0m, in \u001b[0;36mdeprecate_kwarg.<locals>._deprecate_kwarg.<locals>.wrapper\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m    209\u001b[0m     \u001b[39melse\u001b[39;00m:\n\u001b[1;32m    210\u001b[0m         kwargs[new_arg_name] \u001b[39m=\u001b[39m new_arg_value\n\u001b[0;32m--> 211\u001b[0m \u001b[39mreturn\u001b[39;00m func(\u001b[39m*\u001b[39;49margs, \u001b[39m*\u001b[39;49m\u001b[39m*\u001b[39;49mkwargs)\n",
      "File \u001b[0;32m~/.pyenv/versions/3.9.7/lib/python3.9/site-packages/pandas/util/_decorators.py:331\u001b[0m, in \u001b[0;36mdeprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m    325\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39mlen\u001b[39m(args) \u001b[39m>\u001b[39m num_allow_args:\n\u001b[1;32m    326\u001b[0m     warnings\u001b[39m.\u001b[39mwarn(\n\u001b[1;32m    327\u001b[0m         msg\u001b[39m.\u001b[39mformat(arguments\u001b[39m=\u001b[39m_format_argument_list(allow_args)),\n\u001b[1;32m    328\u001b[0m         \u001b[39mFutureWarning\u001b[39;00m,\n\u001b[1;32m    329\u001b[0m         stacklevel\u001b[39m=\u001b[39mfind_stack_level(),\n\u001b[1;32m    330\u001b[0m     )\n\u001b[0;32m--> 331\u001b[0m \u001b[39mreturn\u001b[39;00m func(\u001b[39m*\u001b[39;49margs, \u001b[39m*\u001b[39;49m\u001b[39m*\u001b[39;49mkwargs)\n",
      "File \u001b[0;32m~/.pyenv/versions/3.9.7/lib/python3.9/site-packages/pandas/io/parsers/readers.py:950\u001b[0m, in \u001b[0;36mread_csv\u001b[0;34m(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, error_bad_lines, warn_bad_lines, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options)\u001b[0m\n\u001b[1;32m    935\u001b[0m kwds_defaults \u001b[39m=\u001b[39m _refine_defaults_read(\n\u001b[1;32m    936\u001b[0m     dialect,\n\u001b[1;32m    937\u001b[0m     delimiter,\n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m    946\u001b[0m     defaults\u001b[39m=\u001b[39m{\u001b[39m\"\u001b[39m\u001b[39mdelimiter\u001b[39m\u001b[39m\"\u001b[39m: \u001b[39m\"\u001b[39m\u001b[39m,\u001b[39m\u001b[39m\"\u001b[39m},\n\u001b[1;32m    947\u001b[0m )\n\u001b[1;32m    948\u001b[0m kwds\u001b[39m.\u001b[39mupdate(kwds_defaults)\n\u001b[0;32m--> 950\u001b[0m \u001b[39mreturn\u001b[39;00m _read(filepath_or_buffer, kwds)\n",
      "File \u001b[0;32m~/.pyenv/versions/3.9.7/lib/python3.9/site-packages/pandas/io/parsers/readers.py:605\u001b[0m, in \u001b[0;36m_read\u001b[0;34m(filepath_or_buffer, kwds)\u001b[0m\n\u001b[1;32m    602\u001b[0m _validate_names(kwds\u001b[39m.\u001b[39mget(\u001b[39m\"\u001b[39m\u001b[39mnames\u001b[39m\u001b[39m\"\u001b[39m, \u001b[39mNone\u001b[39;00m))\n\u001b[1;32m    604\u001b[0m \u001b[39m# Create the parser.\u001b[39;00m\n\u001b[0;32m--> 605\u001b[0m parser \u001b[39m=\u001b[39m TextFileReader(filepath_or_buffer, \u001b[39m*\u001b[39;49m\u001b[39m*\u001b[39;49mkwds)\n\u001b[1;32m    607\u001b[0m \u001b[39mif\u001b[39;00m chunksize \u001b[39mor\u001b[39;00m iterator:\n\u001b[1;32m    608\u001b[0m     \u001b[39mreturn\u001b[39;00m parser\n",
      "File \u001b[0;32m~/.pyenv/versions/3.9.7/lib/python3.9/site-packages/pandas/io/parsers/readers.py:1442\u001b[0m, in \u001b[0;36mTextFileReader.__init__\u001b[0;34m(self, f, engine, **kwds)\u001b[0m\n\u001b[1;32m   1439\u001b[0m     \u001b[39mself\u001b[39m\u001b[39m.\u001b[39moptions[\u001b[39m\"\u001b[39m\u001b[39mhas_index_names\u001b[39m\u001b[39m\"\u001b[39m] \u001b[39m=\u001b[39m kwds[\u001b[39m\"\u001b[39m\u001b[39mhas_index_names\u001b[39m\u001b[39m\"\u001b[39m]\n\u001b[1;32m   1441\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mhandles: IOHandles \u001b[39m|\u001b[39m \u001b[39mNone\u001b[39;00m \u001b[39m=\u001b[39m \u001b[39mNone\u001b[39;00m\n\u001b[0;32m-> 1442\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_engine \u001b[39m=\u001b[39m \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49m_make_engine(f, \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49mengine)\n",
      "File \u001b[0;32m~/.pyenv/versions/3.9.7/lib/python3.9/site-packages/pandas/io/parsers/readers.py:1753\u001b[0m, in \u001b[0;36mTextFileReader._make_engine\u001b[0;34m(self, f, engine)\u001b[0m\n\u001b[1;32m   1750\u001b[0m     \u001b[39mraise\u001b[39;00m \u001b[39mValueError\u001b[39;00m(msg)\n\u001b[1;32m   1752\u001b[0m \u001b[39mtry\u001b[39;00m:\n\u001b[0;32m-> 1753\u001b[0m     \u001b[39mreturn\u001b[39;00m mapping[engine](f, \u001b[39m*\u001b[39;49m\u001b[39m*\u001b[39;49m\u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49moptions)\n\u001b[1;32m   1754\u001b[0m \u001b[39mexcept\u001b[39;00m \u001b[39mException\u001b[39;00m:\n\u001b[1;32m   1755\u001b[0m     \u001b[39mif\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mhandles \u001b[39mis\u001b[39;00m \u001b[39mnot\u001b[39;00m \u001b[39mNone\u001b[39;00m:\n",
      "File \u001b[0;32m~/.pyenv/versions/3.9.7/lib/python3.9/site-packages/pandas/io/parsers/c_parser_wrapper.py:79\u001b[0m, in \u001b[0;36mCParserWrapper.__init__\u001b[0;34m(self, src, **kwds)\u001b[0m\n\u001b[1;32m     76\u001b[0m     kwds\u001b[39m.\u001b[39mpop(key, \u001b[39mNone\u001b[39;00m)\n\u001b[1;32m     78\u001b[0m kwds[\u001b[39m\"\u001b[39m\u001b[39mdtype\u001b[39m\u001b[39m\"\u001b[39m] \u001b[39m=\u001b[39m ensure_dtype_objs(kwds\u001b[39m.\u001b[39mget(\u001b[39m\"\u001b[39m\u001b[39mdtype\u001b[39m\u001b[39m\"\u001b[39m, \u001b[39mNone\u001b[39;00m))\n\u001b[0;32m---> 79\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_reader \u001b[39m=\u001b[39m parsers\u001b[39m.\u001b[39;49mTextReader(src, \u001b[39m*\u001b[39;49m\u001b[39m*\u001b[39;49mkwds)\n\u001b[1;32m     81\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39munnamed_cols \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_reader\u001b[39m.\u001b[39munnamed_cols\n\u001b[1;32m     83\u001b[0m \u001b[39m# error: Cannot determine type of 'names'\u001b[39;00m\n",
      "File \u001b[0;32m~/.pyenv/versions/3.9.7/lib/python3.9/site-packages/pandas/_libs/parsers.pyx:547\u001b[0m, in \u001b[0;36mpandas._libs.parsers.TextReader.__cinit__\u001b[0;34m()\u001b[0m\n",
      "File \u001b[0;32m~/.pyenv/versions/3.9.7/lib/python3.9/site-packages/pandas/_libs/parsers.pyx:636\u001b[0m, in \u001b[0;36mpandas._libs.parsers.TextReader._get_header\u001b[0;34m()\u001b[0m\n",
      "File \u001b[0;32m~/.pyenv/versions/3.9.7/lib/python3.9/site-packages/pandas/_libs/parsers.pyx:852\u001b[0m, in \u001b[0;36mpandas._libs.parsers.TextReader._tokenize_rows\u001b[0;34m()\u001b[0m\n",
      "File \u001b[0;32m~/.pyenv/versions/3.9.7/lib/python3.9/site-packages/pandas/_libs/parsers.pyx:1965\u001b[0m, in \u001b[0;36mpandas._libs.parsers.raise_parser_error\u001b[0;34m()\u001b[0m\n",
      "\u001b[0;31mUnicodeDecodeError\u001b[0m: 'utf-8' codec can't decode byte 0xc1 in position 1697: invalid start byte"
     ]
    }
   ],
   "source": [
    "# cp1250 | windows-1250 | Central and Eastern Europe\n",
    "df_gastos = pd.read_csv(\"data/presupuesto_gastos_2023.csv\",sep=\";\")  #Aun tenemos otro fallo, vamos a verlo"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_gastos"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_gastos = pd.read_csv(\"data/presupuesto_gastos_2023.csv\",delimiter=\"\\t\",encoding=\"cp1250\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Otras maneras de cargar datos\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "# por la dirección del fichero en web\n",
    "df_who = pd.read_csv(\"http://www.exploredata.net/ftp/WHO.csv\") #dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>151.0</td>\n",
       "      <td>28.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>26088.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>...</td>\n",
       "      <td>692.50</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>257.00</td>\n",
       "      <td>231.9</td>\n",
       "      <td>257.00</td>\n",
       "      <td>5740436.0</td>\n",
       "      <td>5.44</td>\n",
       "      <td>22.9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Albania</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>27.0</td>\n",
       "      <td>98.7</td>\n",
       "      <td>6000.0</td>\n",
       "      <td>93.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>3172.0</td>\n",
       "      <td>0.6</td>\n",
       "      <td>...</td>\n",
       "      <td>3499.12</td>\n",
       "      <td>4.790000e+09</td>\n",
       "      <td>78.14</td>\n",
       "      <td>-2.040000e+09</td>\n",
       "      <td>18.47</td>\n",
       "      <td>15.5</td>\n",
       "      <td>18.47</td>\n",
       "      <td>1431793.9</td>\n",
       "      <td>2.21</td>\n",
       "      <td>45.4</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>2 rows × 358 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       Country  CountryID  Continent  Adolescent fertility rate (%)  \\\n",
       "0  Afghanistan          1          1                          151.0   \n",
       "1      Albania          2          2                           27.0   \n",
       "\n",
       "   Adult literacy rate (%)  \\\n",
       "0                     28.0   \n",
       "1                     98.7   \n",
       "\n",
       "   Gross national income per capita (PPP international $)  \\\n",
       "0                                                NaN        \n",
       "1                                             6000.0        \n",
       "\n",
       "   Net primary school enrolment ratio female (%)  \\\n",
       "0                                            NaN   \n",
       "1                                           93.0   \n",
       "\n",
       "   Net primary school enrolment ratio male (%)  \\\n",
       "0                                          NaN   \n",
       "1                                         94.0   \n",
       "\n",
       "   Population (in thousands) total  Population annual growth rate (%)  ...  \\\n",
       "0                          26088.0                                4.0  ...   \n",
       "1                           3172.0                                0.6  ...   \n",
       "\n",
       "   Total_CO2_emissions  Total_income  Total_reserves  \\\n",
       "0               692.50           NaN             NaN   \n",
       "1              3499.12  4.790000e+09           78.14   \n",
       "\n",
       "   Trade_balance_goods_and_services  Under_five_mortality_from_CME  \\\n",
       "0                               NaN                         257.00   \n",
       "1                     -2.040000e+09                          18.47   \n",
       "\n",
       "   Under_five_mortality_from_IHME  Under_five_mortality_rate  \\\n",
       "0                           231.9                     257.00   \n",
       "1                            15.5                      18.47   \n",
       "\n",
       "   Urban_population  Urban_population_growth  Urban_population_pct_of_total  \n",
       "0         5740436.0                     5.44                           22.9  \n",
       "1         1431793.9                     2.21                           45.4  \n",
       "\n",
       "[2 rows x 358 columns]"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who.head(2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Unnamed: 0</th>\n",
       "      <th>Nombre</th>\n",
       "      <th>Capital (de iure o en su defecto de facto)</th>\n",
       "      <th>Población (2022)</th>\n",
       "      <th>Porcentaje población</th>\n",
       "      <th>Densidad (hab./km²)</th>\n",
       "      <th>Superficie (km²)</th>\n",
       "      <th>Porcentaje superficie</th>\n",
       "      <th>Mapa</th>\n",
       "      <th>PIB per cápita en € (2021)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>Andalucía</td>\n",
       "      <td>Sevilla</td>\n",
       "      <td>NaN</td>\n",
       "      <td>17,87 %</td>\n",
       "      <td>9705</td>\n",
       "      <td>NaN</td>\n",
       "      <td>17,31 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>Cataluña</td>\n",
       "      <td>Barcelona</td>\n",
       "      <td>NaN</td>\n",
       "      <td>16,38 %</td>\n",
       "      <td>23884</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6,35 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>Comunidad de Madrid</td>\n",
       "      <td>Madrid</td>\n",
       "      <td>NaN</td>\n",
       "      <td>14,24 %</td>\n",
       "      <td>84115</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1,59 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>Comunidad Valenciana</td>\n",
       "      <td>Valencia</td>\n",
       "      <td>NaN</td>\n",
       "      <td>10,67 %</td>\n",
       "      <td>21698</td>\n",
       "      <td>NaN</td>\n",
       "      <td>4,60 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>Galicia</td>\n",
       "      <td>Santiago de Compostela</td>\n",
       "      <td>NaN</td>\n",
       "      <td>5,68 %</td>\n",
       "      <td>9119</td>\n",
       "      <td>NaN</td>\n",
       "      <td>5,84 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>6</td>\n",
       "      <td>Castilla y León</td>\n",
       "      <td>Valladolid[nota 1]​</td>\n",
       "      <td>NaN</td>\n",
       "      <td>5,02 %</td>\n",
       "      <td>2534</td>\n",
       "      <td>NaN</td>\n",
       "      <td>18,62 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>7</td>\n",
       "      <td>País Vasco</td>\n",
       "      <td>Vitoria[nota 1]​</td>\n",
       "      <td>NaN</td>\n",
       "      <td>4,67 %</td>\n",
       "      <td>30213</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1,43 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>8</td>\n",
       "      <td>Canarias</td>\n",
       "      <td>Las Palmas de Gran Canaria y Santa Cruz de Ten...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>4,58 %</td>\n",
       "      <td>30139</td>\n",
       "      <td>[nota 3]​</td>\n",
       "      <td>1,47 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>9</td>\n",
       "      <td>Castilla-La Mancha</td>\n",
       "      <td>Toledo[nota 1]​</td>\n",
       "      <td>NaN</td>\n",
       "      <td>4,32 %</td>\n",
       "      <td>2579</td>\n",
       "      <td>NaN</td>\n",
       "      <td>15,70 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>10</td>\n",
       "      <td>Región de Murcia</td>\n",
       "      <td>Murcia</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3,20 %</td>\n",
       "      <td>13374</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2,24 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>11</td>\n",
       "      <td>Aragón</td>\n",
       "      <td>Zaragoza</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2,79 %</td>\n",
       "      <td>2790</td>\n",
       "      <td>NaN</td>\n",
       "      <td>9,43 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>12</td>\n",
       "      <td>Islas Baleares</td>\n",
       "      <td>Palma</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2,47 %</td>\n",
       "      <td>24427</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0,99 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>13</td>\n",
       "      <td>Extremadura</td>\n",
       "      <td>Mérida</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2,23 %</td>\n",
       "      <td>2541</td>\n",
       "      <td>NaN</td>\n",
       "      <td>8,23 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>14</td>\n",
       "      <td>Principado de Asturias</td>\n",
       "      <td>Oviedo</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2,14 %</td>\n",
       "      <td>9553</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2,13 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>15</td>\n",
       "      <td>Comunidad Foral de Navarra</td>\n",
       "      <td>Pamplona</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1,39 %</td>\n",
       "      <td>6330</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2,05 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>16</td>\n",
       "      <td>Cantabria</td>\n",
       "      <td>Santander</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1,23 %</td>\n",
       "      <td>10974</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1,05 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>17</td>\n",
       "      <td>La Rioja</td>\n",
       "      <td>Logroño</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0,67 %</td>\n",
       "      <td>6267</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1,00 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>18</td>\n",
       "      <td>Melilla</td>\n",
       "      <td>Melilla</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0,18 %</td>\n",
       "      <td>700158</td>\n",
       "      <td>NaN</td>\n",
       "      <td>&lt;0,01 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>19</td>\n",
       "      <td>Ceuta</td>\n",
       "      <td>Ceuta</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0,18 %</td>\n",
       "      <td>417510</td>\n",
       "      <td>NaN</td>\n",
       "      <td>&lt;0,01 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>TOTAL</td>\n",
       "      <td>España</td>\n",
       "      <td>Madrid</td>\n",
       "      <td>NaN</td>\n",
       "      <td>100 %</td>\n",
       "      <td>9367</td>\n",
       "      <td>NaN</td>\n",
       "      <td>100 %</td>\n",
       "      <td>—</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Unnamed: 0                      Nombre  \\\n",
       "0           1                   Andalucía   \n",
       "1           2                    Cataluña   \n",
       "2           3         Comunidad de Madrid   \n",
       "3           4        Comunidad Valenciana   \n",
       "4           5                     Galicia   \n",
       "5           6             Castilla y León   \n",
       "6           7                  País Vasco   \n",
       "7           8                    Canarias   \n",
       "8           9          Castilla-La Mancha   \n",
       "9          10            Región de Murcia   \n",
       "10         11                      Aragón   \n",
       "11         12              Islas Baleares   \n",
       "12         13                 Extremadura   \n",
       "13         14      Principado de Asturias   \n",
       "14         15  Comunidad Foral de Navarra   \n",
       "15         16                   Cantabria   \n",
       "16         17                    La Rioja   \n",
       "17         18                     Melilla   \n",
       "18         19                       Ceuta   \n",
       "19      TOTAL                      España   \n",
       "\n",
       "           Capital (de iure o en su defecto de facto)  Población (2022)  \\\n",
       "0                                             Sevilla               NaN   \n",
       "1                                           Barcelona               NaN   \n",
       "2                                              Madrid               NaN   \n",
       "3                                            Valencia               NaN   \n",
       "4                              Santiago de Compostela               NaN   \n",
       "5                                 Valladolid[nota 1]​               NaN   \n",
       "6                                    Vitoria[nota 1]​               NaN   \n",
       "7   Las Palmas de Gran Canaria y Santa Cruz de Ten...               NaN   \n",
       "8                                     Toledo[nota 1]​               NaN   \n",
       "9                                              Murcia               NaN   \n",
       "10                                           Zaragoza               NaN   \n",
       "11                                              Palma               NaN   \n",
       "12                                             Mérida               NaN   \n",
       "13                                             Oviedo               NaN   \n",
       "14                                           Pamplona               NaN   \n",
       "15                                          Santander               NaN   \n",
       "16                                            Logroño               NaN   \n",
       "17                                            Melilla               NaN   \n",
       "18                                              Ceuta               NaN   \n",
       "19                                             Madrid               NaN   \n",
       "\n",
       "   Porcentaje población  Densidad (hab./km²) Superficie (km²)  \\\n",
       "0               17,87 %                 9705              NaN   \n",
       "1               16,38 %                23884              NaN   \n",
       "2               14,24 %                84115              NaN   \n",
       "3               10,67 %                21698              NaN   \n",
       "4                5,68 %                 9119              NaN   \n",
       "5                5,02 %                 2534              NaN   \n",
       "6                4,67 %                30213              NaN   \n",
       "7                4,58 %                30139        [nota 3]​   \n",
       "8                4,32 %                 2579              NaN   \n",
       "9                3,20 %                13374              NaN   \n",
       "10               2,79 %                 2790              NaN   \n",
       "11               2,47 %                24427              NaN   \n",
       "12               2,23 %                 2541              NaN   \n",
       "13               2,14 %                 9553              NaN   \n",
       "14               1,39 %                 6330              NaN   \n",
       "15               1,23 %                10974              NaN   \n",
       "16               0,67 %                 6267              NaN   \n",
       "17               0,18 %               700158              NaN   \n",
       "18               0,18 %               417510              NaN   \n",
       "19                100 %                 9367              NaN   \n",
       "\n",
       "   Porcentaje superficie Mapa  PIB per cápita en € (2021)  \n",
       "0                17,31 %  NaN                         NaN  \n",
       "1                 6,35 %  NaN                         NaN  \n",
       "2                 1,59 %  NaN                         NaN  \n",
       "3                 4,60 %  NaN                         NaN  \n",
       "4                 5,84 %  NaN                         NaN  \n",
       "5                18,62 %  NaN                         NaN  \n",
       "6                 1,43 %  NaN                         NaN  \n",
       "7                 1,47 %  NaN                         NaN  \n",
       "8                15,70 %  NaN                         NaN  \n",
       "9                 2,24 %  NaN                         NaN  \n",
       "10                9,43 %  NaN                         NaN  \n",
       "11                0,99 %  NaN                         NaN  \n",
       "12                8,23 %  NaN                         NaN  \n",
       "13                2,13 %  NaN                         NaN  \n",
       "14                2,05 %  NaN                         NaN  \n",
       "15                1,05 %  NaN                         NaN  \n",
       "16                1,00 %  NaN                         NaN  \n",
       "17               <0,01 %  NaN                         NaN  \n",
       "18               <0,01 %  NaN                         NaN  \n",
       "19                 100 %    —                         NaN  "
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# extracción de contenido en una página web\n",
    "url = \"https://es.wikipedia.org/wiki/Anexo:Comunidades_y_ciudades_aut%C3%B3nomas_de_Espa%C3%B1a\" \n",
    "\n",
    "comunidades_esp = pd.io.html.read_html(url) \n",
    "comunidades_esp[0] # Alerta! \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Unnamed: 0</th>\n",
       "      <th>Nombre</th>\n",
       "      <th>Capital (de iure o en su defecto de facto)</th>\n",
       "      <th>Población (2022)</th>\n",
       "      <th>Porcentaje población</th>\n",
       "      <th>Densidad (hab./km²)</th>\n",
       "      <th>Superficie (km²)</th>\n",
       "      <th>Porcentaje superficie</th>\n",
       "      <th>Mapa</th>\n",
       "      <th>PIB per cápita en € (2021)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>Andalucía</td>\n",
       "      <td>Sevilla</td>\n",
       "      <td>NaN</td>\n",
       "      <td>17,87 %</td>\n",
       "      <td>9705</td>\n",
       "      <td>NaN</td>\n",
       "      <td>17,31 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>Cataluña</td>\n",
       "      <td>Barcelona</td>\n",
       "      <td>NaN</td>\n",
       "      <td>16,38 %</td>\n",
       "      <td>23884</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6,35 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>Comunidad de Madrid</td>\n",
       "      <td>Madrid</td>\n",
       "      <td>NaN</td>\n",
       "      <td>14,24 %</td>\n",
       "      <td>84115</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1,59 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>Comunidad Valenciana</td>\n",
       "      <td>Valencia</td>\n",
       "      <td>NaN</td>\n",
       "      <td>10,67 %</td>\n",
       "      <td>21698</td>\n",
       "      <td>NaN</td>\n",
       "      <td>4,60 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>Galicia</td>\n",
       "      <td>Santiago de Compostela</td>\n",
       "      <td>NaN</td>\n",
       "      <td>5,68 %</td>\n",
       "      <td>9119</td>\n",
       "      <td>NaN</td>\n",
       "      <td>5,84 %</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  Unnamed: 0                Nombre Capital (de iure o en su defecto de facto)  \\\n",
       "0          1             Andalucía                                    Sevilla   \n",
       "1          2              Cataluña                                  Barcelona   \n",
       "2          3   Comunidad de Madrid                                     Madrid   \n",
       "3          4  Comunidad Valenciana                                   Valencia   \n",
       "4          5               Galicia                     Santiago de Compostela   \n",
       "\n",
       "   Población (2022) Porcentaje población  Densidad (hab./km²)  \\\n",
       "0               NaN              17,87 %                 9705   \n",
       "1               NaN              16,38 %                23884   \n",
       "2               NaN              14,24 %                84115   \n",
       "3               NaN              10,67 %                21698   \n",
       "4               NaN               5,68 %                 9119   \n",
       "\n",
       "  Superficie (km²) Porcentaje superficie Mapa  PIB per cápita en € (2021)  \n",
       "0              NaN               17,31 %  NaN                         NaN  \n",
       "1              NaN                6,35 %  NaN                         NaN  \n",
       "2              NaN                1,59 %  NaN                         NaN  \n",
       "3              NaN                4,60 %  NaN                         NaN  \n",
       "4              NaN                5,84 %  NaN                         NaN  "
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "print(type(comunidades_esp[0]))\n",
    "df_comunidades_esp = comunidades_esp[0]\n",
    "df_comunidades_esp.head() "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date_and_time_of_initial</th>\n",
       "      <th>date_and_time_of_ranger</th>\n",
       "      <th>borough</th>\n",
       "      <th>property</th>\n",
       "      <th>location</th>\n",
       "      <th>species_description</th>\n",
       "      <th>call_source</th>\n",
       "      <th>species_status</th>\n",
       "      <th>animal_condition</th>\n",
       "      <th>duration_of_response</th>\n",
       "      <th>...</th>\n",
       "      <th>_311sr_number</th>\n",
       "      <th>final_ranger_action</th>\n",
       "      <th>of_animals</th>\n",
       "      <th>pep_response</th>\n",
       "      <th>animal_monitored</th>\n",
       "      <th>police_response</th>\n",
       "      <th>esu_response</th>\n",
       "      <th>acc_intake_number</th>\n",
       "      <th>hours_spent_monitoring</th>\n",
       "      <th>rehabilitator</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2021-06-23T16:45:00.000</td>\n",
       "      <td>2021-06-24T08:00:00.000</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Sternberg Park</td>\n",
       "      <td>Inside locked athletic field under construction</td>\n",
       "      <td>Chukar</td>\n",
       "      <td>Other</td>\n",
       "      <td>Exotic</td>\n",
       "      <td>Healthy</td>\n",
       "      <td>6.00</td>\n",
       "      <td>...</td>\n",
       "      <td>311-06712416</td>\n",
       "      <td>ACC</td>\n",
       "      <td>6</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>163537</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2021-06-24T10:00:00.000</td>\n",
       "      <td>2021-06-24T11:00:00.000</td>\n",
       "      <td>Bronx</td>\n",
       "      <td>Haffen Park</td>\n",
       "      <td>Haffen Pool</td>\n",
       "      <td>Sparrow</td>\n",
       "      <td>Central</td>\n",
       "      <td>Native</td>\n",
       "      <td>Healthy</td>\n",
       "      <td>1.75</td>\n",
       "      <td>...</td>\n",
       "      <td>311-06714879</td>\n",
       "      <td>Rehabilitator</td>\n",
       "      <td>4</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2021-06-23T14:30:00.000</td>\n",
       "      <td>2021-06-23T14:30:00.000</td>\n",
       "      <td>Bronx</td>\n",
       "      <td>Pelham Bay Park</td>\n",
       "      <td>Pelham Bay South</td>\n",
       "      <td>White-tailed Deer</td>\n",
       "      <td>Employee</td>\n",
       "      <td>Native</td>\n",
       "      <td>N/A</td>\n",
       "      <td>1.00</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Unfounded</td>\n",
       "      <td>0</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2021-06-23T13:00:00.000</td>\n",
       "      <td>2021-06-23T13:10:00.000</td>\n",
       "      <td>Staten Island</td>\n",
       "      <td>Willowbrook Park</td>\n",
       "      <td>The carousel</td>\n",
       "      <td>Raccoon</td>\n",
       "      <td>Employee</td>\n",
       "      <td>Native</td>\n",
       "      <td>N/A</td>\n",
       "      <td>2.00</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Unfounded</td>\n",
       "      <td>0</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2021-06-23T09:20:00.000</td>\n",
       "      <td>2021-06-23T09:20:00.000</td>\n",
       "      <td>Queens</td>\n",
       "      <td>Judge Moses Weinstein Playground</td>\n",
       "      <td>Garbage can</td>\n",
       "      <td>Virginia Opossum</td>\n",
       "      <td>Central</td>\n",
       "      <td>Native</td>\n",
       "      <td>Healthy</td>\n",
       "      <td>2.25</td>\n",
       "      <td>...</td>\n",
       "      <td>311-06699415</td>\n",
       "      <td>ACC</td>\n",
       "      <td>1</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>119833</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 22 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "  date_and_time_of_initial  date_and_time_of_ranger        borough  \\\n",
       "0  2021-06-23T16:45:00.000  2021-06-24T08:00:00.000       Brooklyn   \n",
       "1  2021-06-24T10:00:00.000  2021-06-24T11:00:00.000          Bronx   \n",
       "2  2021-06-23T14:30:00.000  2021-06-23T14:30:00.000          Bronx   \n",
       "3  2021-06-23T13:00:00.000  2021-06-23T13:10:00.000  Staten Island   \n",
       "4  2021-06-23T09:20:00.000  2021-06-23T09:20:00.000         Queens   \n",
       "\n",
       "                           property  \\\n",
       "0                    Sternberg Park   \n",
       "1                       Haffen Park   \n",
       "2                   Pelham Bay Park   \n",
       "3                  Willowbrook Park   \n",
       "4  Judge Moses Weinstein Playground   \n",
       "\n",
       "                                          location species_description  \\\n",
       "0  Inside locked athletic field under construction              Chukar   \n",
       "1                                      Haffen Pool             Sparrow   \n",
       "2                                 Pelham Bay South   White-tailed Deer   \n",
       "3                                     The carousel             Raccoon   \n",
       "4                                      Garbage can    Virginia Opossum   \n",
       "\n",
       "  call_source species_status animal_condition  duration_of_response  ...  \\\n",
       "0       Other         Exotic          Healthy                  6.00  ...   \n",
       "1     Central         Native          Healthy                  1.75  ...   \n",
       "2    Employee         Native              N/A                  1.00  ...   \n",
       "3    Employee         Native              N/A                  2.00  ...   \n",
       "4     Central         Native          Healthy                  2.25  ...   \n",
       "\n",
       "  _311sr_number final_ranger_action of_animals pep_response  animal_monitored  \\\n",
       "0  311-06712416                 ACC          6        False             False   \n",
       "1  311-06714879       Rehabilitator          4        False             False   \n",
       "2           NaN           Unfounded          0        False             False   \n",
       "3           NaN           Unfounded          0        False             False   \n",
       "4  311-06699415                 ACC          1        False             False   \n",
       "\n",
       "   police_response  esu_response  acc_intake_number  hours_spent_monitoring  \\\n",
       "0            False         False             163537                     NaN   \n",
       "1            False         False                NaN                     NaN   \n",
       "2            False         False                NaN                     NaN   \n",
       "3            False         False                NaN                     NaN   \n",
       "4            False         False             119833                     NaN   \n",
       "\n",
       "  rehabilitator  \n",
       "0           NaN  \n",
       "1           NaN  \n",
       "2           NaN  \n",
       "3           NaN  \n",
       "4           NaN  \n",
       "\n",
       "[5 rows x 22 columns]"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Contenido JSON\n",
    "#Fuente: https://data.cityofnewyork.us/Environment/Urban-Park-Ranger-Animal-Condition-Response/fuhs-xmg2\n",
    "url = 'https://data.cityofnewyork.us/resource/s3vf-x992.json'\n",
    "df = pd.read_json(url)\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Actividades\n",
    "\n",
    "En esta actividad practicaremos la carga de datos en diferentes formatos. En el mundo real, los datos no siempre tienen una estructura y un formato como desearíamos.\n",
    "\n",
    "El objetivo es que analices la carga de estos datos con los datos originales: <br/>\n",
    "- ¿Qué dimensión tienen los datos reales y los cargados?\n",
    "- ¿Cuáles son las columnas?\n",
    "- ¿El concepto de columna como atributo o característica y el concepto de fila como muestra están presentes en la estructura de los datos?\n",
    "- ¿Coinciden con la información del archivo?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 1. Descarga y carga el siguiente fichero:\n",
    "# https://data.cityofnewyork.us/Housing-Development/Speculation-Watch-List/adax-9mit"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 2. Descarga y carga el siguiente fichero:\n",
    "# https://ibestat.es/estadistica/demografia/moviment-natural-de-la-poblacio/naixements-2/?lang=ca\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 3. Y ahora Descargalo y abrelo comprimido!!!. Es decir, sin descomprimir!\n",
    "# https://ec.europa.eu/eurostat/databrowser/view/tin00171/default/table?lang=en\n",
    "# Los archivos comprimidos en formato .gz se pueden abrir directamente como si fueran archivos de datos con Pandas, y en este caso, son archivos del tipo CSV. Es decir, no es necesario descomprimirlos."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Gestión de *dataframes*\n",
    "\n",
    "Durante este curso aprenderemos a modificar los dataframes, agregaremos y eliminaremos columnas, y también modificaremos las que ya tenemos. Por lo tanto, después de realizar este trabajo, es necesario guardar los nuevos datos en un archivo."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_gastos.to_csv('data/tmp_file.csv',encoding='utf-8') # guardant un dataframe en un fitxer, especificant el format"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "### Estructura del dataframe\n",
    "\n",
    "Ahora que ya sabemos cargar dataframes desde archivos, descubriremos cómo podemos acceder a la información que se encuentra dentro de los tipos de variables utilizados por Pandas.\n",
    "\n",
    "Un dataframe tiene columnas y filas. Las filas son muestras y las columnas representan características de una muestra. Una columna es del tipo Serie.\n",
    "\n",
    "**El objetivo de esta unidad es adquirir herramientas para comprender y seleccionar los datos representados en un dataframe y una serie con Pandas.**\n",
    "\n",
    "Comenzaremos seleccionando columnas y obteniendo resúmenes estadísticos de ellas. Más adelante, pasaremos a realizar selecciones de filas en el dataframe. Finalmente, realizaremos selecciones combinadas creando nuestros propios dataframes a partir de los subconjuntos seleccionados.\"\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "df_who = pd.read_csv(\"http://www.exploredata.net/ftp/WHO.csv\") #dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 202 entries, 0 to 201\n",
      "Columns: 358 entries, Country to Urban_population_pct_of_total\n",
      "dtypes: float64(355), int64(2), object(1)\n",
      "memory usage: 565.1+ KB\n"
     ]
    }
   ],
   "source": [
    "df_who.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>Population in urban areas (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>202.000000</td>\n",
       "      <td>202.000000</td>\n",
       "      <td>177.000000</td>\n",
       "      <td>131.000000</td>\n",
       "      <td>178.000000</td>\n",
       "      <td>179.000000</td>\n",
       "      <td>179.000000</td>\n",
       "      <td>1.930000e+02</td>\n",
       "      <td>193.000000</td>\n",
       "      <td>193.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>1.860000e+02</td>\n",
       "      <td>1.780000e+02</td>\n",
       "      <td>128.000000</td>\n",
       "      <td>1.710000e+02</td>\n",
       "      <td>181.000000</td>\n",
       "      <td>170.000000</td>\n",
       "      <td>181.000000</td>\n",
       "      <td>1.880000e+02</td>\n",
       "      <td>188.000000</td>\n",
       "      <td>188.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>101.500000</td>\n",
       "      <td>3.579208</td>\n",
       "      <td>59.457627</td>\n",
       "      <td>78.871756</td>\n",
       "      <td>11250.112360</td>\n",
       "      <td>84.033520</td>\n",
       "      <td>85.698324</td>\n",
       "      <td>3.409805e+04</td>\n",
       "      <td>1.297927</td>\n",
       "      <td>54.911917</td>\n",
       "      <td>...</td>\n",
       "      <td>1.483596e+05</td>\n",
       "      <td>2.015567e+11</td>\n",
       "      <td>57.253516</td>\n",
       "      <td>3.424012e+08</td>\n",
       "      <td>56.677624</td>\n",
       "      <td>54.356471</td>\n",
       "      <td>56.677624</td>\n",
       "      <td>1.665763e+07</td>\n",
       "      <td>2.165851</td>\n",
       "      <td>55.195213</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>58.456537</td>\n",
       "      <td>1.808263</td>\n",
       "      <td>49.105286</td>\n",
       "      <td>20.415760</td>\n",
       "      <td>12586.753417</td>\n",
       "      <td>17.788047</td>\n",
       "      <td>15.451212</td>\n",
       "      <td>1.304957e+05</td>\n",
       "      <td>1.163864</td>\n",
       "      <td>23.554182</td>\n",
       "      <td>...</td>\n",
       "      <td>6.133091e+05</td>\n",
       "      <td>9.400689e+11</td>\n",
       "      <td>138.669298</td>\n",
       "      <td>5.943043e+10</td>\n",
       "      <td>60.060929</td>\n",
       "      <td>61.160556</td>\n",
       "      <td>60.060929</td>\n",
       "      <td>5.094867e+07</td>\n",
       "      <td>1.596628</td>\n",
       "      <td>23.742122</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>23.600000</td>\n",
       "      <td>260.000000</td>\n",
       "      <td>6.000000</td>\n",
       "      <td>11.000000</td>\n",
       "      <td>2.000000e+00</td>\n",
       "      <td>-2.500000</td>\n",
       "      <td>10.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>2.565000e+01</td>\n",
       "      <td>5.190000e+07</td>\n",
       "      <td>0.990000</td>\n",
       "      <td>-7.140000e+11</td>\n",
       "      <td>2.900000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>2.900000</td>\n",
       "      <td>1.545600e+04</td>\n",
       "      <td>-1.160000</td>\n",
       "      <td>10.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>51.250000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>19.000000</td>\n",
       "      <td>68.400000</td>\n",
       "      <td>2112.500000</td>\n",
       "      <td>79.000000</td>\n",
       "      <td>79.500000</td>\n",
       "      <td>1.340000e+03</td>\n",
       "      <td>0.500000</td>\n",
       "      <td>36.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>1.672615e+03</td>\n",
       "      <td>3.317500e+09</td>\n",
       "      <td>16.292500</td>\n",
       "      <td>-1.210000e+09</td>\n",
       "      <td>12.400000</td>\n",
       "      <td>8.475000</td>\n",
       "      <td>12.400000</td>\n",
       "      <td>9.171623e+05</td>\n",
       "      <td>1.105000</td>\n",
       "      <td>35.650000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>101.500000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>46.000000</td>\n",
       "      <td>86.500000</td>\n",
       "      <td>6175.000000</td>\n",
       "      <td>90.000000</td>\n",
       "      <td>90.000000</td>\n",
       "      <td>6.762000e+03</td>\n",
       "      <td>1.300000</td>\n",
       "      <td>57.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>1.021157e+04</td>\n",
       "      <td>1.145000e+10</td>\n",
       "      <td>28.515000</td>\n",
       "      <td>-2.240000e+08</td>\n",
       "      <td>29.980000</td>\n",
       "      <td>27.600000</td>\n",
       "      <td>29.980000</td>\n",
       "      <td>3.427661e+06</td>\n",
       "      <td>1.945000</td>\n",
       "      <td>57.300000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>151.750000</td>\n",
       "      <td>5.000000</td>\n",
       "      <td>91.000000</td>\n",
       "      <td>95.300000</td>\n",
       "      <td>14502.500000</td>\n",
       "      <td>96.000000</td>\n",
       "      <td>96.000000</td>\n",
       "      <td>2.173200e+04</td>\n",
       "      <td>2.100000</td>\n",
       "      <td>73.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>6.549217e+04</td>\n",
       "      <td>8.680000e+10</td>\n",
       "      <td>55.310000</td>\n",
       "      <td>1.024000e+09</td>\n",
       "      <td>88.700000</td>\n",
       "      <td>82.900000</td>\n",
       "      <td>88.700000</td>\n",
       "      <td>9.837113e+06</td>\n",
       "      <td>3.252500</td>\n",
       "      <td>72.750000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>202.000000</td>\n",
       "      <td>7.000000</td>\n",
       "      <td>199.000000</td>\n",
       "      <td>99.800000</td>\n",
       "      <td>60870.000000</td>\n",
       "      <td>100.000000</td>\n",
       "      <td>100.000000</td>\n",
       "      <td>1.328474e+06</td>\n",
       "      <td>4.300000</td>\n",
       "      <td>100.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>5.776432e+06</td>\n",
       "      <td>1.100000e+13</td>\n",
       "      <td>1334.860000</td>\n",
       "      <td>1.390000e+11</td>\n",
       "      <td>267.000000</td>\n",
       "      <td>253.700000</td>\n",
       "      <td>267.000000</td>\n",
       "      <td>5.270000e+08</td>\n",
       "      <td>7.850000</td>\n",
       "      <td>100.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>8 rows × 357 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "        CountryID   Continent  Adolescent fertility rate (%)  \\\n",
       "count  202.000000  202.000000                     177.000000   \n",
       "mean   101.500000    3.579208                      59.457627   \n",
       "std     58.456537    1.808263                      49.105286   \n",
       "min      1.000000    1.000000                       0.000000   \n",
       "25%     51.250000    2.000000                      19.000000   \n",
       "50%    101.500000    3.000000                      46.000000   \n",
       "75%    151.750000    5.000000                      91.000000   \n",
       "max    202.000000    7.000000                     199.000000   \n",
       "\n",
       "       Adult literacy rate (%)  \\\n",
       "count               131.000000   \n",
       "mean                 78.871756   \n",
       "std                  20.415760   \n",
       "min                  23.600000   \n",
       "25%                  68.400000   \n",
       "50%                  86.500000   \n",
       "75%                  95.300000   \n",
       "max                  99.800000   \n",
       "\n",
       "       Gross national income per capita (PPP international $)  \\\n",
       "count                                         178.000000        \n",
       "mean                                        11250.112360        \n",
       "std                                         12586.753417        \n",
       "min                                           260.000000        \n",
       "25%                                          2112.500000        \n",
       "50%                                          6175.000000        \n",
       "75%                                         14502.500000        \n",
       "max                                         60870.000000        \n",
       "\n",
       "       Net primary school enrolment ratio female (%)  \\\n",
       "count                                     179.000000   \n",
       "mean                                       84.033520   \n",
       "std                                        17.788047   \n",
       "min                                         6.000000   \n",
       "25%                                        79.000000   \n",
       "50%                                        90.000000   \n",
       "75%                                        96.000000   \n",
       "max                                       100.000000   \n",
       "\n",
       "       Net primary school enrolment ratio male (%)  \\\n",
       "count                                   179.000000   \n",
       "mean                                     85.698324   \n",
       "std                                      15.451212   \n",
       "min                                      11.000000   \n",
       "25%                                      79.500000   \n",
       "50%                                      90.000000   \n",
       "75%                                      96.000000   \n",
       "max                                     100.000000   \n",
       "\n",
       "       Population (in thousands) total  Population annual growth rate (%)  \\\n",
       "count                     1.930000e+02                         193.000000   \n",
       "mean                      3.409805e+04                           1.297927   \n",
       "std                       1.304957e+05                           1.163864   \n",
       "min                       2.000000e+00                          -2.500000   \n",
       "25%                       1.340000e+03                           0.500000   \n",
       "50%                       6.762000e+03                           1.300000   \n",
       "75%                       2.173200e+04                           2.100000   \n",
       "max                       1.328474e+06                           4.300000   \n",
       "\n",
       "       Population in urban areas (%)  ...  Total_CO2_emissions  Total_income  \\\n",
       "count                     193.000000  ...         1.860000e+02  1.780000e+02   \n",
       "mean                       54.911917  ...         1.483596e+05  2.015567e+11   \n",
       "std                        23.554182  ...         6.133091e+05  9.400689e+11   \n",
       "min                        10.000000  ...         2.565000e+01  5.190000e+07   \n",
       "25%                        36.000000  ...         1.672615e+03  3.317500e+09   \n",
       "50%                        57.000000  ...         1.021157e+04  1.145000e+10   \n",
       "75%                        73.000000  ...         6.549217e+04  8.680000e+10   \n",
       "max                       100.000000  ...         5.776432e+06  1.100000e+13   \n",
       "\n",
       "       Total_reserves  Trade_balance_goods_and_services  \\\n",
       "count      128.000000                      1.710000e+02   \n",
       "mean        57.253516                      3.424012e+08   \n",
       "std        138.669298                      5.943043e+10   \n",
       "min          0.990000                     -7.140000e+11   \n",
       "25%         16.292500                     -1.210000e+09   \n",
       "50%         28.515000                     -2.240000e+08   \n",
       "75%         55.310000                      1.024000e+09   \n",
       "max       1334.860000                      1.390000e+11   \n",
       "\n",
       "       Under_five_mortality_from_CME  Under_five_mortality_from_IHME  \\\n",
       "count                     181.000000                      170.000000   \n",
       "mean                       56.677624                       54.356471   \n",
       "std                        60.060929                       61.160556   \n",
       "min                         2.900000                        3.000000   \n",
       "25%                        12.400000                        8.475000   \n",
       "50%                        29.980000                       27.600000   \n",
       "75%                        88.700000                       82.900000   \n",
       "max                       267.000000                      253.700000   \n",
       "\n",
       "       Under_five_mortality_rate  Urban_population  Urban_population_growth  \\\n",
       "count                 181.000000      1.880000e+02               188.000000   \n",
       "mean                   56.677624      1.665763e+07                 2.165851   \n",
       "std                    60.060929      5.094867e+07                 1.596628   \n",
       "min                     2.900000      1.545600e+04                -1.160000   \n",
       "25%                    12.400000      9.171623e+05                 1.105000   \n",
       "50%                    29.980000      3.427661e+06                 1.945000   \n",
       "75%                    88.700000      9.837113e+06                 3.252500   \n",
       "max                   267.000000      5.270000e+08                 7.850000   \n",
       "\n",
       "       Urban_population_pct_of_total  \n",
       "count                     188.000000  \n",
       "mean                       55.195213  \n",
       "std                        23.742122  \n",
       "min                        10.000000  \n",
       "25%                        35.650000  \n",
       "50%                        57.300000  \n",
       "75%                        72.750000  \n",
       "max                       100.000000  \n",
       "\n",
       "[8 rows x 357 columns]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who.describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Columnes\n",
    "\n",
    "Com hem comentat a l'introducció, començarem amb les columnes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>151.0</td>\n",
       "      <td>28.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>26088.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>...</td>\n",
       "      <td>692.50</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>257.00</td>\n",
       "      <td>231.9</td>\n",
       "      <td>257.00</td>\n",
       "      <td>5740436.0</td>\n",
       "      <td>5.44</td>\n",
       "      <td>22.9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Albania</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>27.0</td>\n",
       "      <td>98.7</td>\n",
       "      <td>6000.0</td>\n",
       "      <td>93.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>3172.0</td>\n",
       "      <td>0.6</td>\n",
       "      <td>...</td>\n",
       "      <td>3499.12</td>\n",
       "      <td>4.790000e+09</td>\n",
       "      <td>78.14</td>\n",
       "      <td>-2.040000e+09</td>\n",
       "      <td>18.47</td>\n",
       "      <td>15.5</td>\n",
       "      <td>18.47</td>\n",
       "      <td>1431793.9</td>\n",
       "      <td>2.21</td>\n",
       "      <td>45.4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Algeria</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>6.0</td>\n",
       "      <td>69.9</td>\n",
       "      <td>5940.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>96.0</td>\n",
       "      <td>33351.0</td>\n",
       "      <td>1.5</td>\n",
       "      <td>...</td>\n",
       "      <td>137535.56</td>\n",
       "      <td>6.970000e+10</td>\n",
       "      <td>351.36</td>\n",
       "      <td>4.700000e+09</td>\n",
       "      <td>40.00</td>\n",
       "      <td>31.2</td>\n",
       "      <td>40.00</td>\n",
       "      <td>20800000.0</td>\n",
       "      <td>2.61</td>\n",
       "      <td>63.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Andorra</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>83.0</td>\n",
       "      <td>83.0</td>\n",
       "      <td>74.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Angola</td>\n",
       "      <td>5</td>\n",
       "      <td>3</td>\n",
       "      <td>146.0</td>\n",
       "      <td>67.4</td>\n",
       "      <td>3890.0</td>\n",
       "      <td>49.0</td>\n",
       "      <td>51.0</td>\n",
       "      <td>16557.0</td>\n",
       "      <td>2.8</td>\n",
       "      <td>...</td>\n",
       "      <td>8991.46</td>\n",
       "      <td>1.490000e+10</td>\n",
       "      <td>27.13</td>\n",
       "      <td>9.140000e+09</td>\n",
       "      <td>164.10</td>\n",
       "      <td>242.5</td>\n",
       "      <td>164.10</td>\n",
       "      <td>8578749.0</td>\n",
       "      <td>4.14</td>\n",
       "      <td>53.3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 358 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       Country  CountryID  Continent  Adolescent fertility rate (%)  \\\n",
       "0  Afghanistan          1          1                          151.0   \n",
       "1      Albania          2          2                           27.0   \n",
       "2      Algeria          3          3                            6.0   \n",
       "3      Andorra          4          2                            NaN   \n",
       "4       Angola          5          3                          146.0   \n",
       "\n",
       "   Adult literacy rate (%)  \\\n",
       "0                     28.0   \n",
       "1                     98.7   \n",
       "2                     69.9   \n",
       "3                      NaN   \n",
       "4                     67.4   \n",
       "\n",
       "   Gross national income per capita (PPP international $)  \\\n",
       "0                                                NaN        \n",
       "1                                             6000.0        \n",
       "2                                             5940.0        \n",
       "3                                                NaN        \n",
       "4                                             3890.0        \n",
       "\n",
       "   Net primary school enrolment ratio female (%)  \\\n",
       "0                                            NaN   \n",
       "1                                           93.0   \n",
       "2                                           94.0   \n",
       "3                                           83.0   \n",
       "4                                           49.0   \n",
       "\n",
       "   Net primary school enrolment ratio male (%)  \\\n",
       "0                                          NaN   \n",
       "1                                         94.0   \n",
       "2                                         96.0   \n",
       "3                                         83.0   \n",
       "4                                         51.0   \n",
       "\n",
       "   Population (in thousands) total  Population annual growth rate (%)  ...  \\\n",
       "0                          26088.0                                4.0  ...   \n",
       "1                           3172.0                                0.6  ...   \n",
       "2                          33351.0                                1.5  ...   \n",
       "3                             74.0                                1.0  ...   \n",
       "4                          16557.0                                2.8  ...   \n",
       "\n",
       "   Total_CO2_emissions  Total_income  Total_reserves  \\\n",
       "0               692.50           NaN             NaN   \n",
       "1              3499.12  4.790000e+09           78.14   \n",
       "2            137535.56  6.970000e+10          351.36   \n",
       "3                  NaN           NaN             NaN   \n",
       "4              8991.46  1.490000e+10           27.13   \n",
       "\n",
       "   Trade_balance_goods_and_services  Under_five_mortality_from_CME  \\\n",
       "0                               NaN                         257.00   \n",
       "1                     -2.040000e+09                          18.47   \n",
       "2                      4.700000e+09                          40.00   \n",
       "3                               NaN                            NaN   \n",
       "4                      9.140000e+09                         164.10   \n",
       "\n",
       "   Under_five_mortality_from_IHME  Under_five_mortality_rate  \\\n",
       "0                           231.9                     257.00   \n",
       "1                            15.5                      18.47   \n",
       "2                            31.2                      40.00   \n",
       "3                             NaN                        NaN   \n",
       "4                           242.5                     164.10   \n",
       "\n",
       "   Urban_population  Urban_population_growth  Urban_population_pct_of_total  \n",
       "0         5740436.0                     5.44                           22.9  \n",
       "1         1431793.9                     2.21                           45.4  \n",
       "2        20800000.0                     2.61                           63.3  \n",
       "3               NaN                      NaN                            NaN  \n",
       "4         8578749.0                     4.14                           53.3  \n",
       "\n",
       "[5 rows x 358 columns]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(202, 358)"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Index(['Country', 'CountryID', 'Continent', 'Adolescent fertility rate (%)',\n",
      "       'Adult literacy rate (%)',\n",
      "       'Gross national income per capita (PPP international $)',\n",
      "       'Net primary school enrolment ratio female (%)',\n",
      "       'Net primary school enrolment ratio male (%)',\n",
      "       'Population (in thousands) total', 'Population annual growth rate (%)',\n",
      "       ...\n",
      "       'Total_CO2_emissions', 'Total_income', 'Total_reserves',\n",
      "       'Trade_balance_goods_and_services', 'Under_five_mortality_from_CME',\n",
      "       'Under_five_mortality_from_IHME', 'Under_five_mortality_rate',\n",
      "       'Urban_population', 'Urban_population_growth',\n",
      "       'Urban_population_pct_of_total'],\n",
      "      dtype='object', length=358)\n"
     ]
    }
   ],
   "source": [
    "# Columnas o características de cada muestra\n",
    "print(df_who.columns)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ix:0\tlabel:Country\n",
      "ix:1\tlabel:CountryID\n",
      "ix:2\tlabel:Continent\n",
      "ix:3\tlabel:Adolescent fertility rate (%)\n",
      "ix:4\tlabel:Adult literacy rate (%)\n",
      "ix:5\tlabel:Gross national income per capita (PPP international $)\n",
      "ix:6\tlabel:Net primary school enrolment ratio female (%)\n",
      "ix:7\tlabel:Net primary school enrolment ratio male (%)\n",
      "ix:8\tlabel:Population (in thousands) total\n",
      "ix:9\tlabel:Population annual growth rate (%)\n",
      "ix:10\tlabel:Population in urban areas (%)\n",
      "ix:11\tlabel:Population living below the poverty line (% living on &lt; US$1 per day)\n",
      "ix:12\tlabel:Population median age (years)\n",
      "ix:13\tlabel:Population proportion over 60 (%)\n",
      "ix:14\tlabel:Population proportion under 15 (%)\n",
      "ix:15\tlabel:Registration coverage of births (%)\n",
      "ix:16\tlabel:Total fertility rate (per woman)\n",
      "ix:17\tlabel:Antenatal care coverage - at least four visits (%)\n",
      "ix:18\tlabel:Antiretroviral therapy coverage among HIV-infected pregt women for PMTCT (%)\n",
      "ix:19\tlabel:Antiretroviral therapy coverage among people with advanced HIV infections (%)\n",
      "ix:20\tlabel:Births attended by skilled health personnel (%)\n",
      "ix:21\tlabel:Births by caesarean section (%)\n",
      "ix:22\tlabel:Children aged 6-59 months who received vitamin A supplementation (%)\n",
      "ix:23\tlabel:Children aged &lt;5 years sleeping under insecticide-treated nets (%)\n",
      "ix:24\tlabel:Children aged &lt;5 years who received any antimalarial treatment for fever (%)\n",
      "ix:25\tlabel:Children aged &lt;5 years with ARI symptoms taken to facility (%)\n",
      "ix:26\tlabel:Children aged &lt;5 years with diarrhoea receiving ORT (%)\n",
      "ix:27\tlabel:Contraceptive prevalence (%)\n",
      "ix:28\tlabel:Neonates protected at birth against neonatal tetanus (PAB) (%)\n",
      "ix:29\tlabel:One-year-olds immunized with MCV\n",
      "ix:30\tlabel:One-year-olds immunized with three doses of diphtheria tetanus toxoid and pertussis (DTP3) (%)\n",
      "ix:31\tlabel:One-year-olds immunized with three doses of Hepatitis B (HepB3) (%)\n",
      "ix:32\tlabel:One-year-olds immunized with three doses of Hib (Hib3) vaccine (%)\n",
      "ix:33\tlabel:Tuberculosis detection rate under DOTS (%)\n",
      "ix:34\tlabel:Tuberculosis treatment success under DOTS (%)\n",
      "ix:35\tlabel:Women who have had mammography (%)\n",
      "ix:36\tlabel:Women who have had PAP smear (%)\n",
      "ix:37\tlabel:Community and traditional health workers density (per 10 000 population)\n",
      "ix:38\tlabel:Dentistry personnel density (per 10 000 population)\n",
      "ix:39\tlabel:Environment and public health workers density (per 10 000 population)\n",
      "ix:40\tlabel:External resources for health as percentage of total expenditure on health\n",
      "ix:41\tlabel:General government expenditure on health as percentage of total expenditure on health\n",
      "ix:42\tlabel:General government expenditure on health as percentage of total government expenditure\n",
      "ix:43\tlabel:Hospital beds (per 10 000 population)\n",
      "ix:44\tlabel:Laboratory health workers density (per 10 000 population)\n",
      "ix:45\tlabel:Number of community and traditional health workers\n",
      "ix:46\tlabel:Number of dentistry personnel\n",
      "ix:47\tlabel:Number of environment and public health workers\n",
      "ix:48\tlabel:Number of laboratory health workers\n",
      "ix:49\tlabel:Number of nursing and midwifery personnel\n",
      "ix:50\tlabel:Number of other health service providers\n",
      "ix:51\tlabel:Number of pharmaceutical personnel\n",
      "ix:52\tlabel:Number of physicians\n",
      "ix:53\tlabel:Nursing and midwifery personnel density (per 10 000 population)\n",
      "ix:54\tlabel:Other health service providers density (per 10 000 population)\n",
      "ix:55\tlabel:Out-of-pocket expenditure as percentage of private expenditure on health\n",
      "ix:56\tlabel:Per capita government expenditure on health (PPP int. $)\n",
      "ix:57\tlabel:Per capita government expenditure on health at average exchange rate (US$)\n",
      "ix:58\tlabel:Per capita total expenditure on health (PPP int. $)\n",
      "ix:59\tlabel:Per capita total expenditure on health at average exchange rate (US$)\n",
      "ix:60\tlabel:Pharmaceutical personnel density (per 10 000 population)\n",
      "ix:61\tlabel:Physicians density (per 10 000 population)\n",
      "ix:62\tlabel:Private expenditure on health as percentage of total expenditure on health\n",
      "ix:63\tlabel:Private prepaid plans as percentage of private expenditure on health\n",
      "ix:64\tlabel:Ratio of health management and support workers to health service providers\n",
      "ix:65\tlabel:Ratio of nurses and midwives to physicians\n",
      "ix:66\tlabel:Social security expenditure on health as percentage of general government expenditure on health\n",
      "ix:67\tlabel:Total expenditure on health as percentage of gross domestic product\n",
      "ix:68\tlabel:Births attended by skilled health personnel (%) highest educational level of mother\n",
      "ix:69\tlabel:Births attended by skilled health personnel (%) highest wealth quintile\n",
      "ix:70\tlabel:Births attended by skilled health personnel (%) lowest educational level of mother\n",
      "ix:71\tlabel:Births attended by skilled health personnel (%) lowest wealth quintile\n",
      "ix:72\tlabel:Births attended by skilled health personnel (%) rural\n",
      "ix:73\tlabel:Births attended by skilled health personnel (%) urban\n",
      "ix:74\tlabel:Births attended by skilled health personnel difference highest lowest educational level of mother\n",
      "ix:75\tlabel:Births attended by skilled health personnel difference highest-lowest wealth quintile\n",
      "ix:76\tlabel:Births attended by skilled health personnel difference urban-rural\n",
      "ix:77\tlabel:Births attended by skilled health personnel ratio highest-lowest educational level of mother\n",
      "ix:78\tlabel:Births attended by skilled health personnel ratio highest-lowest wealth quintile\n",
      "ix:79\tlabel:Births attended by skilled health personnel ratio urban-rural\n",
      "ix:80\tlabel:Measles immunization coverage among one-year-olds (%) highest educational level of mother\n",
      "ix:81\tlabel:Measles immunization coverage among one-year-olds (%) highest wealth quintile\n",
      "ix:82\tlabel:Measles immunization coverage among one-year-olds (%) lowest educational level of mother\n",
      "ix:83\tlabel:Measles immunization coverage among one-year-olds (%) lowest wealth quintile\n",
      "ix:84\tlabel:Measles immunization coverage among one-year-olds (%) rural\n",
      "ix:85\tlabel:Measles immunization coverage among one-year-olds (%) urban\n",
      "ix:86\tlabel:Measles immunization coverage among one-year-olds difference highest-lowest educational level of mother\n",
      "ix:87\tlabel:Measles immunization coverage among one-year-olds difference highest-lowest wealth quintile\n",
      "ix:88\tlabel:Measles immunization coverage among one-year-olds difference urban-rural\n",
      "ix:89\tlabel:Measles immunization coverage among one-year-olds ratio highest-lowest educational level of mother\n",
      "ix:90\tlabel:Measles immunization coverage among one-year-olds ratio highest-lowest wealth quintile\n",
      "ix:91\tlabel:Measles immunization coverage among one-year-olds ratio urban-rural\n",
      "ix:92\tlabel:Under-5 mortality rate (Probability of dying aged &lt; 5 years per 1 000 live births) difference lowest-highest educational level of mother\n",
      "ix:93\tlabel:Under-5 mortality rate (Probability of dying aged &lt; 5 years per 1 000 live births) difference lowest-highest wealth quintile\n",
      "ix:94\tlabel:Under-5 mortality rate (Probability of dying aged &lt; 5 years per 1 000 live births) difference rural-urban\n",
      "ix:95\tlabel:Under-5 mortality rate (Probability of dying aged &lt; 5 years per 1 000 live births) highest educational level of mother\n",
      "ix:96\tlabel:Under-5 mortality rate (Probability of dying aged &lt; 5 years per 1 000 live births) highest wealth quintile\n",
      "ix:97\tlabel:Under-5 mortality rate (Probability of dying aged &lt; 5 years per 1 000 live births) lowest educational level of mother\n",
      "ix:98\tlabel:Under-5 mortality rate (Probability of dying aged &lt; 5 years per 1 000 live births) lowest wealth quintile\n",
      "ix:99\tlabel:Under-5 mortality rate (Probability of dying aged &lt; 5 years per 1 000 live births) ratio lowest-highest educational level of mother\n",
      "ix:100\tlabel:Under-5 mortality rate (Probability of dying aged &lt; 5 years per 1 000 live births) ratio lowest-highest wealth quintile\n",
      "ix:101\tlabel:Under-5 mortality rate (Probability of dying aged &lt; 5 years per 1 000 live births) ratio rural-urban\n",
      "ix:102\tlabel:Under-5 mortality rate (Probability of dying aged &lt; 5 years per 1 000 live births) rural\n",
      "ix:103\tlabel:Under-5 mortality rate (Probability of dying aged &lt; 5 years per 1 000 live births) urban\n",
      "ix:104\tlabel:Adult mortality rate (probability of dying between 15 to 60 years per 1000 population) both sexes\n",
      "ix:105\tlabel:Adult mortality rate (probability of dying between 15 to 60 years per 1000 population) female\n",
      "ix:106\tlabel:Adult mortality rate (probability of dying between 15 to 60 years per 1000 population) male\n",
      "ix:107\tlabel:Age-standardized mortality rate for cancer (per 100 000 population)\n",
      "ix:108\tlabel:Age-standardized mortality rate for cardiovascular diseases (per 100 000 population)\n",
      "ix:109\tlabel:Age-standardized mortality rate for injuries (per 100 000 population)\n",
      "ix:110\tlabel:Age-standardized mortality rate for non-communicable diseases (per 100 000 population)\n",
      "ix:111\tlabel:Deaths among children under five years of age due to diarrhoeal diseases (%)\n",
      "ix:112\tlabel:Deaths among children under five years of age due to HIV/AIDS (%)\n",
      "ix:113\tlabel:Deaths among children under five years of age due to injuries (%)\n",
      "ix:114\tlabel:Deaths among children under five years of age due to malaria (%)\n",
      "ix:115\tlabel:Deaths among children under five years of age due to measles (%)\n",
      "ix:116\tlabel:Deaths among children under five years of age due to neonatal causes (%)\n",
      "ix:117\tlabel:Deaths among children under five years of age due to other causes (%)\n",
      "ix:118\tlabel:Deaths among children under five years of age due to pneumonia (%)\n",
      "ix:119\tlabel:Deaths due to HIV/AIDS (per 100 000 population per year)\n",
      "ix:120\tlabel:Deaths due to tuberculosis among HIV-negative people (per 100 000 population)\n",
      "ix:121\tlabel:Deaths due to tuberculosis among HIV-positive people (per 100 000 population)\n",
      "ix:122\tlabel:Healthy life expectancy (HALE) at birth (years) both sexes\n",
      "ix:123\tlabel:Healthy life expectancy (HALE) at birth (years) female\n",
      "ix:124\tlabel:Healthy life expectancy (HALE) at birth (years) male\n",
      "ix:125\tlabel:Incidence of tuberculosis (per 100 000 population per year)\n",
      "ix:126\tlabel:Infant mortality rate (per 1 000 live births) both sexes\n",
      "ix:127\tlabel:Infant mortality rate (per 1 000 live births) female\n",
      "ix:128\tlabel:Infant mortality rate (per 1 000 live births) male\n",
      "ix:129\tlabel:Life expectancy at birth (years) both sexes\n",
      "ix:130\tlabel:Life expectancy at birth (years) female \n",
      "ix:131\tlabel:Life expectancy at birth (years) male\n",
      "ix:132\tlabel:Maternal mortality ratio (per 100 000 live births)\n",
      "ix:133\tlabel:Neonatal mortality rate (per 1 000 live births)\n",
      "ix:134\tlabel:Number of confirmed poliomyelitis cases\n",
      "ix:135\tlabel:Prevalence of HIV among adults aged &gt;=15 years (per 100 000 population)\n",
      "ix:136\tlabel:Prevalence of tuberculosis (per 100 000 population)\n",
      "ix:137\tlabel:Under-5 mortality rate (probability of dying by age 5 per 1000 live births) both sexes\n",
      "ix:138\tlabel:Under-5 mortality rate (probability of dying by age 5 per 1000 live births) female\n",
      "ix:139\tlabel:Under-5 mortality rate (probability of dying by age 5 per 1000 live births) male\n",
      "ix:140\tlabel:Years of life lost to communicable diseases (%)\n",
      "ix:141\tlabel:Years of life lost to injuries (%)\n",
      "ix:142\tlabel:Years of life lost to non-communicable diseases (%)\n",
      "ix:143\tlabel:Children under five years of age overweight for age (%)\n",
      "ix:144\tlabel:Children under five years of age stunted for age (%)\n",
      "ix:145\tlabel:Children under five years of age underweight for age (%)\n",
      "ix:146\tlabel:Newborns with low birth weight (%)\n",
      "ix:147\tlabel:Per capita recorded alcohol consumption (litres of pure alcohol) among adults (&gt;=15 years)\n",
      "ix:148\tlabel:Population using solid fuels (%) rural\n",
      "ix:149\tlabel:Population using solid fuels (%) urban\n",
      "ix:150\tlabel:Population with sustainable access to improved drinking water sources (%) rural\n",
      "ix:151\tlabel:Population with sustainable access to improved drinking water sources (%) total\n",
      "ix:152\tlabel:Population with sustainable access to improved drinking water sources (%) urban\n",
      "ix:153\tlabel:Population with sustainable access to improved sanitation (%) rural\n",
      "ix:154\tlabel:Population with sustainable access to improved sanitation (%) total\n",
      "ix:155\tlabel:Population with sustainable access to improved sanitation (%) urban\n",
      "ix:156\tlabel:Prevalence of adults (&gt;=15 years) who are obese (%) female\n",
      "ix:157\tlabel:Prevalence of adults (&gt;=15 years) who are obese (%) male\n",
      "ix:158\tlabel:Prevalence of condom use by young people (15-24 years) at higher risk sex (%) female\n",
      "ix:159\tlabel:Prevalence of condom use by young people (15-24 years) at higher risk sex (%) male\n",
      "ix:160\tlabel:Prevalence of current tobacco use among adolescents (13-15 years) (%) both sexes\n",
      "ix:161\tlabel:Prevalence of current tobacco use among adolescents (13-15 years) (%) female\n",
      "ix:162\tlabel:Prevalence of current tobacco use among adolescents (13-15 years) (%) male\n",
      "ix:163\tlabel:Prevalence of current tobacco use among adults (&gt;=15 years) (%) both sexes\n",
      "ix:164\tlabel:Prevalence of current tobacco use among adults (&gt;=15 years) (%) female\n",
      "ix:165\tlabel:Prevalence of current tobacco use among adults (&gt;=15 years) (%) male\n",
      "ix:166\tlabel:Adolescent_fertility_rate\n",
      "ix:167\tlabel:Agricultural_land\n",
      "ix:168\tlabel:Agriculture_contribution_to_economy\n",
      "ix:169\tlabel:Aid_given\n",
      "ix:170\tlabel:Aid_received\n",
      "ix:171\tlabel:Aid_received_total\n",
      "ix:172\tlabel:All_forms_of_TB_new_cases_per_100_000_estimated\n",
      "ix:173\tlabel:All_forms_of_TB_new_cases_per_100_000_reported\n",
      "ix:174\tlabel:Annual_freshwater_withdrawals_total\n",
      "ix:175\tlabel:Arms_exports\n",
      "ix:176\tlabel:Arms_imports\n",
      "ix:177\tlabel:Bad_teeth_per_child\n",
      "ix:178\tlabel:Births_attended_by_skilled_health_staff\n",
      "ix:179\tlabel:Breast_cancer_deaths_per_100_000_women\n",
      "ix:180\tlabel:Breast_cancer_new_cases_per_100_000_women\n",
      "ix:181\tlabel:Breast_cancer_number_of_female_deaths\n",
      "ix:182\tlabel:Breast_cancer_number_of_new_female_cases\n",
      "ix:183\tlabel:Broadband_subscribers\n",
      "ix:184\tlabel:Broadband_subscribers_per_100_people\n",
      "ix:185\tlabel:CO2_emissions\n",
      "ix:186\tlabel:CO2_intensity_of_economic_output\n",
      "ix:187\tlabel:Capital_formation\n",
      "ix:188\tlabel:Cell_phones_per_100_people\n",
      "ix:189\tlabel:Cell_phones_total\n",
      "ix:190\tlabel:Central_bank_discount_rate\n",
      "ix:191\tlabel:Cervical_cancer_deaths_per_100_000_women\n",
      "ix:192\tlabel:Cervical_cancer_new_cases_per_100_000_women\n",
      "ix:193\tlabel:Cervical_cancer_number_of_female_deaths\n",
      "ix:194\tlabel:Cervical_cancer_number_of_new_female_cases\n",
      "ix:195\tlabel:Children_and_elderly\n",
      "ix:196\tlabel:Children_out_of_school_primary\n",
      "ix:197\tlabel:Children_out_of_school_primary_female\n",
      "ix:198\tlabel:Children_out_of_school_primary_male\n",
      "ix:199\tlabel:Children_per_woman\n",
      "ix:200\tlabel:Coal_consumption\n",
      "ix:201\tlabel:Coal_consumption_per_person\n",
      "ix:202\tlabel:Coal_production\n",
      "ix:203\tlabel:Coal_production_per_person\n",
      "ix:204\tlabel:Colon_and_Rectum_cancer_deaths_per_100_000_men\n",
      "ix:205\tlabel:Colon_and_Rectum_cancer_deaths_per_100_000_women\n",
      "ix:206\tlabel:Colon_and_Rectum_cancer_new_cases_per_100_000_men\n",
      "ix:207\tlabel:Colon_and_Rectum_cancer_new_cases_per_100_000_women\n",
      "ix:208\tlabel:Colon_and_Rectum_cancer_number_of_female_deaths\n",
      "ix:209\tlabel:Colon_and_Rectum_cancer_number_of_male_deaths\n",
      "ix:210\tlabel:Colon_and_Rectum_cancer_number_of_new_female_cases\n",
      "ix:211\tlabel:Colon_and_Rectum_cancer_number_of_new_male_cases\n",
      "ix:212\tlabel:Consumer_price_index\n",
      "ix:213\tlabel:Contraceptive_use\n",
      "ix:214\tlabel:Deaths_from_TB_per_100_000_estimated\n",
      "ix:215\tlabel:Debt_servicing_costs\n",
      "ix:216\tlabel:Democracy_score\n",
      "ix:217\tlabel:Electric_power_consumption\n",
      "ix:218\tlabel:Electricity_generation\n",
      "ix:219\tlabel:Electricity_generation_per_person\n",
      "ix:220\tlabel:Energy_use\n",
      "ix:221\tlabel:Expenditure_per_student_primary\n",
      "ix:222\tlabel:Expenditure_per_student_secondary\n",
      "ix:223\tlabel:Expenditure_per_student_tertiary\n",
      "ix:224\tlabel:Exports_of_goods_and_services\n",
      "ix:225\tlabel:Exports_unit_value\n",
      "ix:226\tlabel:External_debt_total_DOD_current_USdollars\n",
      "ix:227\tlabel:External_debt_total_pct_of_GNI\n",
      "ix:228\tlabel:Female_labour_force\n",
      "ix:229\tlabel:Fixed_line_and_mobile_phone_subscribers\n",
      "ix:230\tlabel:Foreign_direct_investment_net_inflows\n",
      "ix:231\tlabel:Foreign_direct_investment_net_outflows\n",
      "ix:232\tlabel:Forest_area\n",
      "ix:233\tlabel:Gross_capital_formation\n",
      "ix:234\tlabel:HIV_infected\n",
      "ix:235\tlabel:Health_expenditure_per_person\n",
      "ix:236\tlabel:Health_expenditure_private\n",
      "ix:237\tlabel:Health_expenditure_public_pct_of_GDP\n",
      "ix:238\tlabel:Health_expenditure_public_pct_of_government_expenditure\n",
      "ix:239\tlabel:Health_expenditure_public_pct_of_total_health_expenditure\n",
      "ix:240\tlabel:Health_expenditure_total\n",
      "ix:241\tlabel:High_technology_exports\n",
      "ix:242\tlabel:Hydroelectricity_consumption\n",
      "ix:243\tlabel:Hydroelectricity_consumption_per_person\n",
      "ix:244\tlabel:Imports_of_goods_and_services\n",
      "ix:245\tlabel:Imports_unit_value\n",
      "ix:246\tlabel:Improved_sanitation_facilities_urban\n",
      "ix:247\tlabel:Improved_water_source\n",
      "ix:248\tlabel:Income_growth\n",
      "ix:249\tlabel:Income_per_person\n",
      "ix:250\tlabel:Income_share_held_by_lowest_20pct\n",
      "ix:251\tlabel:Industry_contribution_to_economy\n",
      "ix:252\tlabel:Inequality_index\n",
      "ix:253\tlabel:Infant_mortality_rate\n",
      "ix:254\tlabel:Infectious_TB_new_cases_per_100_000_estimated\n",
      "ix:255\tlabel:Infectious_TB_new_cases_per_100_000_reported\n",
      "ix:256\tlabel:Infectious_TB_treatment_completeness\n",
      "ix:257\tlabel:Inflation_GDP_deflator\n",
      "ix:258\tlabel:Internet_users\n",
      "ix:259\tlabel:Life_expectancy_at_birth\n",
      "ix:260\tlabel:Literacy_rate_adult_female\n",
      "ix:261\tlabel:Literacy_rate_adult_male\n",
      "ix:262\tlabel:Literacy_rate_adult_total\n",
      "ix:263\tlabel:Literacy_rate_youth_female\n",
      "ix:264\tlabel:Literacy_rate_youth_male\n",
      "ix:265\tlabel:Literacy_rate_youth_total\n",
      "ix:266\tlabel:Liver_cancer_deaths_per_100_000_men\n",
      "ix:267\tlabel:Liver_cancer_deaths_per_100_000_women\n",
      "ix:268\tlabel:Liver_cancer_new_cases_per_100_000_men\n",
      "ix:269\tlabel:Liver_cancer_new_cases_per_100_000_women\n",
      "ix:270\tlabel:Liver_cancer_number_of_female_deaths\n",
      "ix:271\tlabel:Liver_cancer_number_of_male_deaths\n",
      "ix:272\tlabel:Liver_cancer_number_of_new_female_cases\n",
      "ix:273\tlabel:Liver_cancer_number_of_new_male_cases\n",
      "ix:274\tlabel:Lung_cancer_deaths_per_100_000_men\n",
      "ix:275\tlabel:Lung_cancer_deaths_per_100_000_women\n",
      "ix:276\tlabel:Lung_cancer_new_cases_per_100_000_men\n",
      "ix:277\tlabel:Lung_cancer_new_cases_per_100_000_women\n",
      "ix:278\tlabel:Lung_cancer_number_of_female_deaths\n",
      "ix:279\tlabel:Lung_cancer_number_of_male_deaths\n",
      "ix:280\tlabel:Lung_cancer_number_of_new_female_cases\n",
      "ix:281\tlabel:Lung_cancer_number_of_new_male_cases\n",
      "ix:282\tlabel:Malaria_prevention_insecticide_treated_bed_nets_usage\n",
      "ix:283\tlabel:Malaria_treatment\n",
      "ix:284\tlabel:Malnutrition_weight_for_age\n",
      "ix:285\tlabel:Market_value_of_listed_companies\n",
      "ix:286\tlabel:Maternal_mortality\n",
      "ix:287\tlabel:Math_achievement_4th_grade\n",
      "ix:288\tlabel:Math_achievement_8th_grade\n",
      "ix:289\tlabel:Measles_immunization\n",
      "ix:290\tlabel:Medical_Doctors\n",
      "ix:291\tlabel:Merchandise_trade\n",
      "ix:292\tlabel:Military_expenditure\n",
      "ix:293\tlabel:Natural_gas_consumption\n",
      "ix:294\tlabel:Natural_gas_consumption_per_person\n",
      "ix:295\tlabel:Natural_gas_production\n",
      "ix:296\tlabel:Natural_gas_production_per_person\n",
      "ix:297\tlabel:Natural_gas_proved_reserves\n",
      "ix:298\tlabel:Natural_gas_proven_reserves_per_person\n",
      "ix:299\tlabel:Net_barter_terms_of_trade\n",
      "ix:300\tlabel:Nuclear_consumption\n",
      "ix:301\tlabel:Nuclear_consumption_per_person\n",
      "ix:302\tlabel:Number_of_deaths_from_TB_estimated\n",
      "ix:303\tlabel:Number_of_existing_TB_cases_estimated\n",
      "ix:304\tlabel:Oil_consumption\n",
      "ix:305\tlabel:Oil_consumption_per_person\n",
      "ix:306\tlabel:Oil_production\n",
      "ix:307\tlabel:Oil_production_per_person\n",
      "ix:308\tlabel:Oil_proved_reserves\n",
      "ix:309\tlabel:Oil_proven_reserves_per_person\n",
      "ix:310\tlabel:Old_version_of_Income_per_person\n",
      "ix:311\tlabel:Patent_applications\n",
      "ix:312\tlabel:Patents_granted\n",
      "ix:313\tlabel:Patents_in_force\n",
      "ix:314\tlabel:People_living_with_HIV\n",
      "ix:315\tlabel:Personal_computers_per_100_people\n",
      "ix:316\tlabel:Personal_computers_total\n",
      "ix:317\tlabel:Population_growth\n",
      "ix:318\tlabel:Population_in_urban_agglomerations_more_than_1_million\n",
      "ix:319\tlabel:Population_total\n",
      "ix:320\tlabel:Poverty_headcount_ratio_at_national_poverty_line\n",
      "ix:321\tlabel:Present_value_of_debt\n",
      "ix:322\tlabel:Primary_completion_rate_total\n",
      "ix:323\tlabel:Primary_energy_consumption\n",
      "ix:324\tlabel:Primary_energy_consumption_per_person\n",
      "ix:325\tlabel:Primary_school_completion_pct_of_boys\n",
      "ix:326\tlabel:Primary_school_completion_pct_of_girls\n",
      "ix:327\tlabel:Prostate_cancer_deaths_per_100_000_men\n",
      "ix:328\tlabel:Prostate_cancer_new_cases_per_100_000_men\n",
      "ix:329\tlabel:Prostate_cancer_number_of_male_deaths\n",
      "ix:330\tlabel:Prostate_cancer_number_of_new_male_cases\n",
      "ix:331\tlabel:Pump_price_for_gasoline\n",
      "ix:332\tlabel:Ratio_of_girls_to_boys_in_primary_and_secondary_education\n",
      "ix:333\tlabel:Ratio_of_young_literate_females_to_males\n",
      "ix:334\tlabel:Roads_paved\n",
      "ix:335\tlabel:SO2_emissions_per_person\n",
      "ix:336\tlabel:Services_contribution_to_economy\n",
      "ix:337\tlabel:Stomach_cancer_deaths_per_100_000_men\n",
      "ix:338\tlabel:Stomach_cancer_deaths_per_100_000_women\n",
      "ix:339\tlabel:Stomach_cancer_new_cases_per_100_000_men\n",
      "ix:340\tlabel:Stomach_cancer_new_cases_per_100_000_women\n",
      "ix:341\tlabel:Stomach_cancer_number_of_female_deaths\n",
      "ix:342\tlabel:Stomach_cancer_number_of_male_deaths\n",
      "ix:343\tlabel:Stomach_cancer_number_of_new_female_cases\n",
      "ix:344\tlabel:Stomach_cancer_number_of_new_male_cases\n",
      "ix:345\tlabel:Sugar_per_person\n",
      "ix:346\tlabel:Surface_area\n",
      "ix:347\tlabel:Tax_revenue\n",
      "ix:348\tlabel:Total_CO2_emissions\n",
      "ix:349\tlabel:Total_income\n",
      "ix:350\tlabel:Total_reserves\n",
      "ix:351\tlabel:Trade_balance_goods_and_services\n",
      "ix:352\tlabel:Under_five_mortality_from_CME\n",
      "ix:353\tlabel:Under_five_mortality_from_IHME\n",
      "ix:354\tlabel:Under_five_mortality_rate\n",
      "ix:355\tlabel:Urban_population\n",
      "ix:356\tlabel:Urban_population_growth\n",
      "ix:357\tlabel:Urban_population_pct_of_total\n"
     ]
    }
   ],
   "source": [
    "for ix, col in enumerate(df_who.columns): #label\n",
    "    print(\"ix:%i\\tlabel:%s\"%(ix,col))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Si inspeccionamos y comparamos los tipos del dataframe y de las columnas..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "pandas.core.frame.DataFrame"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(df_who)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "pandas.core.indexes.base.Index"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(df_who.columns)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "\n",
    "\n",
    "Podemos utilizar el nombre de una columna para obtener los datos de dicha columna, tal como lo hacíamos con un diccionario `python`. Veremos dos maneras diferentes de hacerlo:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0             Afghanistan\n",
       "1                 Albania\n",
       "2                 Algeria\n",
       "3                 Andorra\n",
       "4                  Angola\n",
       "              ...        \n",
       "197               Vietnam\n",
       "198    West Bank and Gaza\n",
       "199                 Yemen\n",
       "200                Zambia\n",
       "201              Zimbabwe\n",
       "Name: Country, Length: 202, dtype: object"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "paises = df_who[\"Country\"]\n",
    "paises"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "**¿Qué tipo de datos es una columna?**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "pandas.core.series.Series"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(df_who[\"Country\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Las `Series` són la otra estructura básica de Pandas. Las filas y las columnas se estructuran en `Series`, se pueden ver cómo un tipo de lista que solamente puede contener un único tipo de datos, acepta operaciones vectoriales y se puede indexar de manera similar a un diccionario."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Una vez que seleccionamos una columna, podemos acceder a sus elementos como si fueran una lista:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Afghanistan\n",
      "------------------------------\n",
      "0    Afghanistan\n",
      "1        Albania\n",
      "2        Algeria\n",
      "3        Andorra\n",
      "4         Angola\n",
      "Name: Country, dtype: object\n",
      "------------------------------\n",
      "['Afghanistan' 'Albania' 'Algeria' 'Andorra' 'Angola'\n",
      " 'Antigua and Barbuda' 'Argentina' 'Armenia' 'Australia' 'Austria'\n",
      " 'Azerbaijan' 'Bahamas' 'Bahrain' 'Bangladesh' 'Barbados' 'Belarus'\n",
      " 'Belgium' 'Belize' 'Benin' 'Bermuda' 'Bhutan' 'Bolivia'\n",
      " 'Bosnia and Herzegovina' 'Botswana' 'Brazil' 'Brunei Darussalam'\n",
      " 'Bulgaria' 'Burkina Faso' 'Burundi' 'Cambodia' 'Cameroon' 'Canada'\n",
      " 'Cape Verde' 'Central African Republic' 'Chad' 'Chile' 'China' 'Colombia'\n",
      " 'Comoros' 'Congo, Dem. Rep.' 'Congo, Rep.' 'Cook Islands' 'Costa Rica'\n",
      " \"Cote d'Ivoire\" 'Croatia' 'Cuba' 'Cyprus' 'Czech Republic' 'Denmark'\n",
      " 'Djibouti' 'Dominica' 'Dominican Republic' 'Ecuador' 'Egypt'\n",
      " 'El Salvador' 'Equatorial Guinea' 'Eritrea' 'Estonia' 'Ethiopia' 'Fiji'\n",
      " 'Finland' 'France' 'French Polynesia' 'Gabon' 'Gambia' 'Georgia'\n",
      " 'Germany' 'Ghana' 'Greece' 'Grenada' 'Guatemala' 'Guinea' 'Guinea-Bissau'\n",
      " 'Guyana' 'Haiti' 'Honduras' 'Hong Kong, China' 'Hungary' 'Iceland'\n",
      " 'India' 'Indonesia' 'Iran (Islamic Republic of)' 'Iraq' 'Ireland'\n",
      " 'Israel' 'Italy' 'Jamaica' 'Japan' 'Jordan' 'Kazakhstan' 'Kenya'\n",
      " 'Kiribati' 'Korea, Dem. Rep.' 'Korea, Rep.' 'Kuwait' 'Kyrgyzstan'\n",
      " \"Lao People's Democratic Republic\" 'Latvia' 'Lebanon' 'Lesotho' 'Liberia'\n",
      " 'Libyan Arab Jamahiriya' 'Lithuania' 'Luxembourg' 'Macao, China'\n",
      " 'Macedonia' 'Madagascar' 'Malawi' 'Malaysia' 'Maldives' 'Mali' 'Malta'\n",
      " 'Marshall Islands' 'Mauritania' 'Mauritius' 'Mexico'\n",
      " 'Micronesia (Federated States of)' 'Moldova' 'Monaco' 'Mongolia'\n",
      " 'Montenegro' 'Morocco' 'Mozambique' 'Myanmar' 'Namibia' 'Nauru' 'Nepal'\n",
      " 'Netherlands' 'Netherlands Antilles' 'New Caledonia' 'New Zealand'\n",
      " 'Nicaragua' 'Niger' 'Nigeria' 'Niue' 'Norway' 'Oman' 'Pakistan' 'Palau'\n",
      " 'Panama' 'Papua New Guinea' 'Paraguay' 'Peru' 'Philippines' 'Poland'\n",
      " 'Portugal' 'Puerto Rico' 'Qatar' 'Romania' 'Russia' 'Rwanda'\n",
      " 'Saint Kitts and Nevis' 'Saint Lucia' 'Saint Vincent and the Grenadines'\n",
      " 'Samoa' 'San Marino' 'Sao Tome and Principe' 'Saudi Arabia' 'Senegal'\n",
      " 'Serbia' 'Seychelles' 'Sierra Leone' 'Singapore' 'Slovakia' 'Slovenia'\n",
      " 'Solomon Islands' 'Somalia' 'South Africa' 'Spain' 'Sri Lanka' 'Sudan'\n",
      " 'Suriname' 'Swaziland' 'Sweden' 'Switzerland' 'Syria' 'Taiwan'\n",
      " 'Tajikistan' 'Tanzania' 'Thailand' 'Timor-Leste' 'Togo' 'Tonga'\n",
      " 'Trinidad and Tobago' 'Tunisia' 'Turkey' 'Turkmenistan' 'Tuvalu' 'Uganda'\n",
      " 'Ukraine' 'United Arab Emirates' 'United Kingdom'\n",
      " 'United States of America' 'Uruguay' 'Uzbekistan' 'Vanuatu' 'Venezuela'\n",
      " 'Vietnam' 'West Bank and Gaza' 'Yemen' 'Zambia' 'Zimbabwe']\n"
     ]
    }
   ],
   "source": [
    "print(df_who[\"Country\"][0])\n",
    "print(\"-\"*30)\n",
    "print(df_who[\"Country\"][:5]) # slicing\n",
    "print(\"-\"*30)\n",
    "print(df_who[\"Country\"].values)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Existe otra manera más sencilla de seleccionar una única columna, pero existen nombres de columna muy largos: 'Children aged <5 years who received any antimalarial treatment for fever (%)'.\n",
    "\n",
    "**Nota**: En la creación de documentos, es importante usar nombres adecuados para las columnas."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0             Afghanistan\n",
       "1                 Albania\n",
       "2                 Algeria\n",
       "3                 Andorra\n",
       "4                  Angola\n",
       "              ...        \n",
       "197               Vietnam\n",
       "198    West Bank and Gaza\n",
       "199                 Yemen\n",
       "200                Zambia\n",
       "201              Zimbabwe\n",
       "Name: Country, Length: 202, dtype: object"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who.Country"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Population annual growth rate (%)\n",
      "------------------------------\n",
      "0      4.0\n",
      "1      0.6\n",
      "2      1.5\n",
      "3      1.0\n",
      "4      2.8\n",
      "      ... \n",
      "197    1.4\n",
      "198    NaN\n",
      "199    3.0\n",
      "200    1.9\n",
      "201    0.8\n",
      "Name: Population annual growth rate (%), Length: 202, dtype: float64\n"
     ]
    }
   ],
   "source": [
    "print(df_who.columns[9])\n",
    "print(\"-\"*30)\n",
    "print(df_who[df_who.columns[9]])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>151.0</td>\n",
       "      <td>28.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Albania</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>27.0</td>\n",
       "      <td>98.7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Algeria</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>6.0</td>\n",
       "      <td>69.9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Andorra</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Angola</td>\n",
       "      <td>5</td>\n",
       "      <td>3</td>\n",
       "      <td>146.0</td>\n",
       "      <td>67.4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>197</th>\n",
       "      <td>Vietnam</td>\n",
       "      <td>198</td>\n",
       "      <td>6</td>\n",
       "      <td>25.0</td>\n",
       "      <td>90.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>198</th>\n",
       "      <td>West Bank and Gaza</td>\n",
       "      <td>199</td>\n",
       "      <td>1</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>199</th>\n",
       "      <td>Yemen</td>\n",
       "      <td>200</td>\n",
       "      <td>1</td>\n",
       "      <td>83.0</td>\n",
       "      <td>54.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>200</th>\n",
       "      <td>Zambia</td>\n",
       "      <td>201</td>\n",
       "      <td>3</td>\n",
       "      <td>161.0</td>\n",
       "      <td>68.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>201</th>\n",
       "      <td>Zimbabwe</td>\n",
       "      <td>202</td>\n",
       "      <td>3</td>\n",
       "      <td>101.0</td>\n",
       "      <td>89.5</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>202 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                Country  CountryID  Continent  Adolescent fertility rate (%)  \\\n",
       "0           Afghanistan          1          1                          151.0   \n",
       "1               Albania          2          2                           27.0   \n",
       "2               Algeria          3          3                            6.0   \n",
       "3               Andorra          4          2                            NaN   \n",
       "4                Angola          5          3                          146.0   \n",
       "..                  ...        ...        ...                            ...   \n",
       "197             Vietnam        198          6                           25.0   \n",
       "198  West Bank and Gaza        199          1                            NaN   \n",
       "199               Yemen        200          1                           83.0   \n",
       "200              Zambia        201          3                          161.0   \n",
       "201            Zimbabwe        202          3                          101.0   \n",
       "\n",
       "     Adult literacy rate (%)  \n",
       "0                       28.0  \n",
       "1                       98.7  \n",
       "2                       69.9  \n",
       "3                        NaN  \n",
       "4                       67.4  \n",
       "..                       ...  \n",
       "197                     90.3  \n",
       "198                      NaN  \n",
       "199                     54.1  \n",
       "200                     68.0  \n",
       "201                     89.5  \n",
       "\n",
       "[202 rows x 5 columns]"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Multiples columnas\n",
    "df_who[df_who.columns[0:5]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>197</th>\n",
       "      <td>198</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>198</th>\n",
       "      <td>199</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>199</th>\n",
       "      <td>200</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>200</th>\n",
       "      <td>201</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>201</th>\n",
       "      <td>202</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>202 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     CountryID  Continent\n",
       "0            1          1\n",
       "1            2          2\n",
       "2            3          3\n",
       "3            4          2\n",
       "4            5          3\n",
       "..         ...        ...\n",
       "197        198          6\n",
       "198        199          1\n",
       "199        200          1\n",
       "200        201          3\n",
       "201        202          3\n",
       "\n",
       "[202 rows x 2 columns]"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Dos columnas específicas\n",
    "df_who[[\"CountryID\",\"Continent\"]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(202, 2)\n",
      "   Adolescent fertility rate (%)  Adult literacy rate (%)\n",
      "0                          151.0                     28.0\n",
      "1                           27.0                     98.7\n",
      "2                            6.0                     69.9\n",
      "3                            NaN                      NaN\n",
      "4                          146.0                     67.4\n"
     ]
    }
   ],
   "source": [
    "# Existen dos métodos loc e iloc para acceder a sub-regiones de los datos\n",
    "# funció: .iloc(filas, columnas)\n",
    "df2 = df_who.iloc[:,3:5]\n",
    "print(df2.shape)\n",
    "print(df2.head())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Columnas"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Cuando seleccionamos una columna de un DataFrame, obtenemos una Serie. Las Series tienen ciertas características, como la capacidad de aplicar métodos estadísticos (si son Series numéricas)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0      151.0\n",
       "1       27.0\n",
       "2        6.0\n",
       "3        NaN\n",
       "4      146.0\n",
       "       ...  \n",
       "197     25.0\n",
       "198      NaN\n",
       "199     83.0\n",
       "200    161.0\n",
       "201    101.0\n",
       "Name: Adolescent fertility rate (%), Length: 202, dtype: float64"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fertilitat = df_who[df_who.columns[3]]\n",
    "\n",
    "fertilitat"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Min  0.0\n",
      "Max  199.0\n",
      "Count  177\n"
     ]
    }
   ],
   "source": [
    "print(\"Min \", fertilitat.min()) # a Pandas el concepte d'iterar \"no té sentit\"\n",
    "print(\"Max \", fertilitat.max())\n",
    "print(\"Count \", fertilitat.count())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Tabla con las funciones descriptivas<br/>\n",
    "<img src=\"https://i.imgur.com/OYnOFwL.png\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Veremos que obtener esta información estadística nos puede ayudar a extraer información muy concreta de la tabla, por ejemplo, si queremos saber:\n",
    "\n",
    "**¿Qué país tiene la mayor emisión de CO2?**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5776431.5"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "co2 = df_who[\"Total_CO2_emissions\"]\n",
    "co2.max()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'United States of America'"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who[co2==co2.max()][\"Country\"].values[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0         692.50\n",
       "1        3499.12\n",
       "2      137535.56\n",
       "3            NaN\n",
       "4        8991.46\n",
       "         ...    \n",
       "197    101826.23\n",
       "198       655.86\n",
       "199     20148.34\n",
       "200      2366.94\n",
       "201     11457.33\n",
       "Name: Total_CO2_emissions, Length: 202, dtype: float64"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "co2 = df_who[\"Total_CO2_emissions\"]\n",
    "co2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5776431.5"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "co2.max()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0      False\n",
       "1      False\n",
       "2      False\n",
       "3      False\n",
       "4      False\n",
       "       ...  \n",
       "197    False\n",
       "198    False\n",
       "199    False\n",
       "200    False\n",
       "201    False\n",
       "Name: Total_CO2_emissions, Length: 202, dtype: bool"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "co2==co2.max()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>192</th>\n",
       "      <td>United States of America</td>\n",
       "      <td>193</td>\n",
       "      <td>4</td>\n",
       "      <td>43.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>44070.0</td>\n",
       "      <td>93.0</td>\n",
       "      <td>91.0</td>\n",
       "      <td>302841.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>5776431.5</td>\n",
       "      <td>1.100000e+13</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-7.140000e+11</td>\n",
       "      <td>8.0</td>\n",
       "      <td>7.1</td>\n",
       "      <td>8.0</td>\n",
       "      <td>240000000.0</td>\n",
       "      <td>1.39</td>\n",
       "      <td>80.8</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>1 rows × 358 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                      Country  CountryID  Continent  \\\n",
       "192  United States of America        193          4   \n",
       "\n",
       "     Adolescent fertility rate (%)  Adult literacy rate (%)  \\\n",
       "192                           43.0                      NaN   \n",
       "\n",
       "     Gross national income per capita (PPP international $)  \\\n",
       "192                                            44070.0        \n",
       "\n",
       "     Net primary school enrolment ratio female (%)  \\\n",
       "192                                           93.0   \n",
       "\n",
       "     Net primary school enrolment ratio male (%)  \\\n",
       "192                                         91.0   \n",
       "\n",
       "     Population (in thousands) total  Population annual growth rate (%)  ...  \\\n",
       "192                         302841.0                                1.0  ...   \n",
       "\n",
       "     Total_CO2_emissions  Total_income  Total_reserves  \\\n",
       "192            5776431.5  1.100000e+13             NaN   \n",
       "\n",
       "     Trade_balance_goods_and_services  Under_five_mortality_from_CME  \\\n",
       "192                     -7.140000e+11                            8.0   \n",
       "\n",
       "     Under_five_mortality_from_IHME  Under_five_mortality_rate  \\\n",
       "192                             7.1                        8.0   \n",
       "\n",
       "     Urban_population  Urban_population_growth  Urban_population_pct_of_total  \n",
       "192       240000000.0                     1.39                           80.8  \n",
       "\n",
       "[1 rows x 358 columns]"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who[co2==co2.max()]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "192    United States of America\n",
       "Name: Country, dtype: object"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who[co2==co2.max()][\"Country\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(['United States of America'], dtype=object)"
      ]
     },
     "execution_count": 58,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who[co2==co2.max()][\"Country\"].values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'United States of America'"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who[co2==co2.max()][\"Country\"].values[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "112     NaN\n",
      "24     71.0\n",
      "15     22.0\n",
      "Name: Adolescent fertility rate (%), dtype: float64\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "53      48.0\n",
       "149     28.0\n",
       "13     135.0\n",
       "Name: Adolescent fertility rate (%), dtype: float64"
      ]
     },
     "execution_count": 64,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# También existen métodos para seleccionar de manera aleatoria muestras dentro de una serie\n",
    "# Tip: Métodos montecarlo\n",
    "fertilidad = df_who[df_who.columns[3]]\n",
    "some = fertilidad.sample(n=3)\n",
    "print(some)\n",
    "fertilidad.sample(n=3,random_state=2) #on random_state és la llavor/seed del aleatori"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "### Filas\n",
    "\n",
    "Cada fila tiene un índice. El índice puede ser numérico, alfabético o temporal."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "RangeIndex(start=0, stop=202, step=1)"
      ]
     },
     "execution_count": 65,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who.index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,\n",
       "        13,  14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,\n",
       "        26,  27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,\n",
       "        39,  40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,\n",
       "        52,  53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,\n",
       "        65,  66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,\n",
       "        78,  79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,\n",
       "        91,  92,  93,  94,  95,  96,  97,  98,  99, 100, 101, 102, 103,\n",
       "       104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116,\n",
       "       117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129,\n",
       "       130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142,\n",
       "       143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155,\n",
       "       156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168,\n",
       "       169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181,\n",
       "       182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194,\n",
       "       195, 196, 197, 198, 199, 200, 201])"
      ]
     },
     "execution_count": 66,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who.index.values"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Así como seleccionamos columnas, podemos seleccionar información con para obtener filas. Para realizar la consulta de una fila concreta usaremos el atributo `loc` de los dataframes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "df = pd.read_csv(\"http://www.exploredata.net/ftp/WHO.csv\") #dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Country                           Afghanistan\n",
      "CountryID                                   1\n",
      "Continent                                   1\n",
      "Adolescent fertility rate (%)           151.0\n",
      "Adult literacy rate (%)                  28.0\n",
      "                                     ...     \n",
      "Under_five_mortality_from_IHME          231.9\n",
      "Under_five_mortality_rate               257.0\n",
      "Urban_population                    5740436.0\n",
      "Urban_population_growth                  5.44\n",
      "Urban_population_pct_of_total            22.9\n",
      "Name: 0, Length: 358, dtype: object\n"
     ]
    }
   ],
   "source": [
    "\n",
    "print(df.loc[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Si lo que necesitamos es obtener son los valores, necesitaremos el atributo `values`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(['Afghanistan', 1, 1, 151.0, 28.0, nan, nan, nan, 26088.0, 4.0,\n",
       "       23.0, nan, 16.0, 4.0, 47.0, 6.0, 7.2, nan, nan, nan, 14.0, nan,\n",
       "       nan, nan, nan, nan, nan, 10.3, 73.0, 70.0, 83.0, 83.0, nan, 66.0,\n",
       "       90.0, nan, nan, nan, nan, nan, 20.1, 27.5, 4.4, 4.0, nan, nan,\n",
       "       900.0, nan, nan, 14930.0, nan, 900.0, 5970.0, 5.0, nan, 97.2, 8.0,\n",
       "       6.0, 29.0, 23.0, nan, 2.0, 72.5, 0.0, nan, 2.5, 0.0, 5.4, nan, nan,\n",
       "       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,\n",
       "       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,\n",
       "       nan, nan, nan, nan, nan, nan, nan, nan, 473.0, 443.0, 500.0, 153.0,\n",
       "       706.0, 134.0, 1269.0, 18.9, 0.3, 1.1, 1.0, 5.9, 26.0, 22.1, 24.8,\n",
       "       10.0, 32.0, 0.0, 36.0, 36.0, 35.0, 161.0, 165.0, 154.0, 176.0,\n",
       "       42.0, 43.0, 42.0, 1800.0, 60.0, 16.0, 100.0, 231.0, 257.0, 254.0,\n",
       "       260.0, 76.0, 6.0, 18.0, 4.6, 59.3, 32.9, nan, 0.01, nan, nan, 17.0,\n",
       "       22.0, 37.0, 25.0, 30.0, 45.0, nan, nan, nan, nan, 9.8, 3.2, 13.1,\n",
       "       nan, nan, nan, nan, 58.35, 36.1, nan, 108.83, 2750000000.0, 168.0,\n",
       "       87.0, 42.29, 0.0, 28000000.0, 2.9, 14.3, 11.7, 26.8, 874.0, 2021.0,\n",
       "       220.0, 0.000878, 0.02, 0.04, 62300000000.0, 4.0, 600000.0, nan,\n",
       "       3.6, 6.9, 254.0, 511.0, 96.8, nan, nan, 1792633.0, 7.07, nan, nan,\n",
       "       nan, nan, 3.3, 2.8, 5.2, 4.5, 193.0, 236.0, 316.0, 377.0, nan,\n",
       "       10.3, 33.0, nan, -7.0, nan, nan, nan, nan, nan, nan, nan, 12.43,\n",
       "       nan, nan, nan, nan, 5.19, 0.0, nan, 8670.0, 25.05, nan, 20.0, 4.16,\n",
       "       1.04, 3.3, 20.0, 5.2, nan, nan, nan, 55.66, nan, 49.0, 39.0, nan,\n",
       "       874.0, nan, 24.48, nan, 165.0, 76.0, 40.0, 90.0, 11.9, 1.0, 43.4,\n",
       "       12.59, 43.14, 28.0, 18.39, 50.81, 34.26, 3.5, 2.3, 3.7, 2.5, 147.0,\n",
       "       218.0, 155.0, 233.0, 11.3, 2.7, 12.2, 2.9, 173.0, 675.0, 190.0,\n",
       "       732.0, nan, nan, nan, nan, 1900.0, nan, nan, 64.0, 0.19, 39.42,\n",
       "       9.93, nan, nan, nan, nan, nan, nan, nan, nan, nan, 8242.0, 66826.0,\n",
       "       nan, nan, nan, nan, nan, nan, 717.04, nan, nan, nan, nan, nan, nan,\n",
       "       nan, nan, 29900000.0, nan, nan, 37.73, nan, nan, nan, nan, 2.8,\n",
       "       4.5, 151.0, 249.0, 0.68, 55.57, 36.2, 23.66, 3.14, 39.42, 15.8,\n",
       "       8.3, 18.5, 9.7, 499.0, 936.0, 592.0, 1108.0, nan, 652090.0, nan,\n",
       "       692.5, nan, nan, nan, 257.0, 231.9, 257.0, 5740436.0, 5.44, 22.9],\n",
       "      dtype=object)"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[0].values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Afghanistan\n",
      "Afghanistan\n",
      "151.0\n"
     ]
    }
   ],
   "source": [
    "print(df_who.loc[0].Country) \n",
    "print(df_who.loc[0][0])\n",
    "print(df_who.loc[0][3])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "#### *Slicing*\n",
    "\n",
    "Utilizando el atributo `loc` del dataframe podemos seleccionar y filtrar las filas (y columnas) mediante labels. En el caso de filas, el label es el índice y si éste es númerico podemos usar  los _slicing_ típicos de `Python`.\n",
    "\n",
    "```\n",
    ".loc(row_labels,columns_labels)\n",
    "```\n",
    "\n",
    "Recordemos el _slicing_:\n",
    "```{python}\n",
    "sublista = lista[start:stop:step]\n",
    "```\n",
    "\n",
    "Dónde:\n",
    "* **start**: Posición de la lista original dónde empieza la sublista. Si no se indica és 0.\n",
    "* **stop**:  Posición de la lista original hasta donde seleccionar. Se selecciona hasta la posición stop - 1.\n",
    "* **step**:  Incremento entre cada índice de la selección, por defecto 1."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Si entendemos el concepto para un array..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1, 2]\n",
      "[4, 5, 6, 7, 9, 0]\n",
      "[1, 2, 3]\n"
     ]
    }
   ],
   "source": [
    "array =[1,2,3,4,5,6,7,9,0]\n",
    "print(array[0:2]) #**\n",
    "print(array[3:]) #**\n",
    "print(array[:3]) #**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "... podemos hacer las mismas operaciones con las filas de un _dataFrame_."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Angola</td>\n",
       "      <td>5</td>\n",
       "      <td>3</td>\n",
       "      <td>146.0</td>\n",
       "      <td>67.4</td>\n",
       "      <td>3890.0</td>\n",
       "      <td>49.0</td>\n",
       "      <td>51.0</td>\n",
       "      <td>16557.0</td>\n",
       "      <td>2.8</td>\n",
       "      <td>...</td>\n",
       "      <td>8991.46</td>\n",
       "      <td>1.490000e+10</td>\n",
       "      <td>27.13</td>\n",
       "      <td>9.140000e+09</td>\n",
       "      <td>164.1</td>\n",
       "      <td>242.5</td>\n",
       "      <td>164.1</td>\n",
       "      <td>8578749.0</td>\n",
       "      <td>4.14</td>\n",
       "      <td>53.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Argentina</td>\n",
       "      <td>7</td>\n",
       "      <td>5</td>\n",
       "      <td>62.0</td>\n",
       "      <td>97.2</td>\n",
       "      <td>11670.0</td>\n",
       "      <td>98.0</td>\n",
       "      <td>99.0</td>\n",
       "      <td>39134.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>152711.86</td>\n",
       "      <td>3.140000e+11</td>\n",
       "      <td>21.11</td>\n",
       "      <td>1.190000e+10</td>\n",
       "      <td>18.1</td>\n",
       "      <td>16.7</td>\n",
       "      <td>18.1</td>\n",
       "      <td>34900000.0</td>\n",
       "      <td>1.17</td>\n",
       "      <td>90.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Australia</td>\n",
       "      <td>9</td>\n",
       "      <td>6</td>\n",
       "      <td>16.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>33940.0</td>\n",
       "      <td>97.0</td>\n",
       "      <td>96.0</td>\n",
       "      <td>20530.0</td>\n",
       "      <td>1.1</td>\n",
       "      <td>...</td>\n",
       "      <td>368858.53</td>\n",
       "      <td>4.680000e+11</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.280000e+10</td>\n",
       "      <td>5.9</td>\n",
       "      <td>5.1</td>\n",
       "      <td>5.9</td>\n",
       "      <td>18000000.0</td>\n",
       "      <td>1.54</td>\n",
       "      <td>88.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Azerbaijan</td>\n",
       "      <td>11</td>\n",
       "      <td>2</td>\n",
       "      <td>31.0</td>\n",
       "      <td>98.8</td>\n",
       "      <td>5430.0</td>\n",
       "      <td>83.0</td>\n",
       "      <td>86.0</td>\n",
       "      <td>8406.0</td>\n",
       "      <td>0.6</td>\n",
       "      <td>...</td>\n",
       "      <td>36629.01</td>\n",
       "      <td>9.930000e+09</td>\n",
       "      <td>64.89</td>\n",
       "      <td>1.330000e+09</td>\n",
       "      <td>50.0</td>\n",
       "      <td>63.7</td>\n",
       "      <td>50.0</td>\n",
       "      <td>4321803.0</td>\n",
       "      <td>1.26</td>\n",
       "      <td>51.5</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>4 rows × 358 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       Country  CountryID  Continent  Adolescent fertility rate (%)  \\\n",
       "4       Angola          5          3                          146.0   \n",
       "6    Argentina          7          5                           62.0   \n",
       "8    Australia          9          6                           16.0   \n",
       "10  Azerbaijan         11          2                           31.0   \n",
       "\n",
       "    Adult literacy rate (%)  \\\n",
       "4                      67.4   \n",
       "6                      97.2   \n",
       "8                       NaN   \n",
       "10                     98.8   \n",
       "\n",
       "    Gross national income per capita (PPP international $)  \\\n",
       "4                                              3890.0        \n",
       "6                                             11670.0        \n",
       "8                                             33940.0        \n",
       "10                                             5430.0        \n",
       "\n",
       "    Net primary school enrolment ratio female (%)  \\\n",
       "4                                            49.0   \n",
       "6                                            98.0   \n",
       "8                                            97.0   \n",
       "10                                           83.0   \n",
       "\n",
       "    Net primary school enrolment ratio male (%)  \\\n",
       "4                                          51.0   \n",
       "6                                          99.0   \n",
       "8                                          96.0   \n",
       "10                                         86.0   \n",
       "\n",
       "    Population (in thousands) total  Population annual growth rate (%)  ...  \\\n",
       "4                           16557.0                                2.8  ...   \n",
       "6                           39134.0                                1.0  ...   \n",
       "8                           20530.0                                1.1  ...   \n",
       "10                           8406.0                                0.6  ...   \n",
       "\n",
       "    Total_CO2_emissions  Total_income  Total_reserves  \\\n",
       "4               8991.46  1.490000e+10           27.13   \n",
       "6             152711.86  3.140000e+11           21.11   \n",
       "8             368858.53  4.680000e+11             NaN   \n",
       "10             36629.01  9.930000e+09           64.89   \n",
       "\n",
       "    Trade_balance_goods_and_services  Under_five_mortality_from_CME  \\\n",
       "4                       9.140000e+09                          164.1   \n",
       "6                       1.190000e+10                           18.1   \n",
       "8                      -1.280000e+10                            5.9   \n",
       "10                      1.330000e+09                           50.0   \n",
       "\n",
       "    Under_five_mortality_from_IHME  Under_five_mortality_rate  \\\n",
       "4                            242.5                      164.1   \n",
       "6                             16.7                       18.1   \n",
       "8                              5.1                        5.9   \n",
       "10                            63.7                       50.0   \n",
       "\n",
       "    Urban_population  Urban_population_growth  Urban_population_pct_of_total  \n",
       "4          8578749.0                     4.14                           53.3  \n",
       "6         34900000.0                     1.17                           90.1  \n",
       "8         18000000.0                     1.54                           88.2  \n",
       "10         4321803.0                     1.26                           51.5  \n",
       "\n",
       "[4 rows x 358 columns]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[4:10:2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "También se puede realizar una selección particular mediante una lista:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Andorra</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>83.0</td>\n",
       "      <td>83.0</td>\n",
       "      <td>74.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Azerbaijan</td>\n",
       "      <td>11</td>\n",
       "      <td>2</td>\n",
       "      <td>31.0</td>\n",
       "      <td>98.8</td>\n",
       "      <td>5430.0</td>\n",
       "      <td>83.0</td>\n",
       "      <td>86.0</td>\n",
       "      <td>8406.0</td>\n",
       "      <td>0.6</td>\n",
       "      <td>...</td>\n",
       "      <td>36629.01</td>\n",
       "      <td>9.930000e+09</td>\n",
       "      <td>64.89</td>\n",
       "      <td>1.330000e+09</td>\n",
       "      <td>50.0</td>\n",
       "      <td>63.7</td>\n",
       "      <td>50.0</td>\n",
       "      <td>4321803.0</td>\n",
       "      <td>1.26</td>\n",
       "      <td>51.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>Cambodia</td>\n",
       "      <td>30</td>\n",
       "      <td>7</td>\n",
       "      <td>52.0</td>\n",
       "      <td>73.6</td>\n",
       "      <td>1550.0</td>\n",
       "      <td>89.0</td>\n",
       "      <td>91.0</td>\n",
       "      <td>14197.0</td>\n",
       "      <td>1.7</td>\n",
       "      <td>...</td>\n",
       "      <td>538.61</td>\n",
       "      <td>5.680000e+09</td>\n",
       "      <td>32.94</td>\n",
       "      <td>-5.470000e+08</td>\n",
       "      <td>97.3</td>\n",
       "      <td>94.3</td>\n",
       "      <td>97.3</td>\n",
       "      <td>2749235.0</td>\n",
       "      <td>4.58</td>\n",
       "      <td>19.7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>Chad</td>\n",
       "      <td>35</td>\n",
       "      <td>3</td>\n",
       "      <td>193.0</td>\n",
       "      <td>25.7</td>\n",
       "      <td>1170.0</td>\n",
       "      <td>49.0</td>\n",
       "      <td>71.0</td>\n",
       "      <td>10468.0</td>\n",
       "      <td>3.1</td>\n",
       "      <td>...</td>\n",
       "      <td>139.23</td>\n",
       "      <td>2.790000e+09</td>\n",
       "      <td>14.17</td>\n",
       "      <td>-2.210000e+08</td>\n",
       "      <td>208.2</td>\n",
       "      <td>180.1</td>\n",
       "      <td>208.2</td>\n",
       "      <td>2566839.0</td>\n",
       "      <td>4.88</td>\n",
       "      <td>25.3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>4 rows × 358 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       Country  CountryID  Continent  Adolescent fertility rate (%)  \\\n",
       "3      Andorra          4          2                            NaN   \n",
       "10  Azerbaijan         11          2                           31.0   \n",
       "29    Cambodia         30          7                           52.0   \n",
       "34        Chad         35          3                          193.0   \n",
       "\n",
       "    Adult literacy rate (%)  \\\n",
       "3                       NaN   \n",
       "10                     98.8   \n",
       "29                     73.6   \n",
       "34                     25.7   \n",
       "\n",
       "    Gross national income per capita (PPP international $)  \\\n",
       "3                                                 NaN        \n",
       "10                                             5430.0        \n",
       "29                                             1550.0        \n",
       "34                                             1170.0        \n",
       "\n",
       "    Net primary school enrolment ratio female (%)  \\\n",
       "3                                            83.0   \n",
       "10                                           83.0   \n",
       "29                                           89.0   \n",
       "34                                           49.0   \n",
       "\n",
       "    Net primary school enrolment ratio male (%)  \\\n",
       "3                                          83.0   \n",
       "10                                         86.0   \n",
       "29                                         91.0   \n",
       "34                                         71.0   \n",
       "\n",
       "    Population (in thousands) total  Population annual growth rate (%)  ...  \\\n",
       "3                              74.0                                1.0  ...   \n",
       "10                           8406.0                                0.6  ...   \n",
       "29                          14197.0                                1.7  ...   \n",
       "34                          10468.0                                3.1  ...   \n",
       "\n",
       "    Total_CO2_emissions  Total_income  Total_reserves  \\\n",
       "3                   NaN           NaN             NaN   \n",
       "10             36629.01  9.930000e+09           64.89   \n",
       "29               538.61  5.680000e+09           32.94   \n",
       "34               139.23  2.790000e+09           14.17   \n",
       "\n",
       "    Trade_balance_goods_and_services  Under_five_mortality_from_CME  \\\n",
       "3                                NaN                            NaN   \n",
       "10                      1.330000e+09                           50.0   \n",
       "29                     -5.470000e+08                           97.3   \n",
       "34                     -2.210000e+08                          208.2   \n",
       "\n",
       "    Under_five_mortality_from_IHME  Under_five_mortality_rate  \\\n",
       "3                              NaN                        NaN   \n",
       "10                            63.7                       50.0   \n",
       "29                            94.3                       97.3   \n",
       "34                           180.1                      208.2   \n",
       "\n",
       "    Urban_population  Urban_population_growth  Urban_population_pct_of_total  \n",
       "3                NaN                      NaN                            NaN  \n",
       "10         4321803.0                     1.26                           51.5  \n",
       "29         2749235.0                     4.58                           19.7  \n",
       "34         2566839.0                     4.88                           25.3  \n",
       "\n",
       "[4 rows x 358 columns]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[[3,10,29,34]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "### Selección de filas y columnas\n",
    "\n",
    "Si seguimos con la misma lógica, usando el atributo `loc` de los _dataFrames_."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "df = pd.read_csv(\"http://www.exploredata.net/ftp/WHO.csv\") #dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>151.0</td>\n",
       "      <td>28.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>26088.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>...</td>\n",
       "      <td>692.50</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>257.00</td>\n",
       "      <td>231.9</td>\n",
       "      <td>257.00</td>\n",
       "      <td>5740436.0</td>\n",
       "      <td>5.44</td>\n",
       "      <td>22.9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Albania</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>27.0</td>\n",
       "      <td>98.7</td>\n",
       "      <td>6000.0</td>\n",
       "      <td>93.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>3172.0</td>\n",
       "      <td>0.6</td>\n",
       "      <td>...</td>\n",
       "      <td>3499.12</td>\n",
       "      <td>4.790000e+09</td>\n",
       "      <td>78.14</td>\n",
       "      <td>-2.040000e+09</td>\n",
       "      <td>18.47</td>\n",
       "      <td>15.5</td>\n",
       "      <td>18.47</td>\n",
       "      <td>1431793.9</td>\n",
       "      <td>2.21</td>\n",
       "      <td>45.4</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>2 rows × 358 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       Country  CountryID  Continent  Adolescent fertility rate (%)  \\\n",
       "0  Afghanistan          1          1                          151.0   \n",
       "1      Albania          2          2                           27.0   \n",
       "\n",
       "   Adult literacy rate (%)  \\\n",
       "0                     28.0   \n",
       "1                     98.7   \n",
       "\n",
       "   Gross national income per capita (PPP international $)  \\\n",
       "0                                                NaN        \n",
       "1                                             6000.0        \n",
       "\n",
       "   Net primary school enrolment ratio female (%)  \\\n",
       "0                                            NaN   \n",
       "1                                           93.0   \n",
       "\n",
       "   Net primary school enrolment ratio male (%)  \\\n",
       "0                                          NaN   \n",
       "1                                         94.0   \n",
       "\n",
       "   Population (in thousands) total  Population annual growth rate (%)  ...  \\\n",
       "0                          26088.0                                4.0  ...   \n",
       "1                           3172.0                                0.6  ...   \n",
       "\n",
       "   Total_CO2_emissions  Total_income  Total_reserves  \\\n",
       "0               692.50           NaN             NaN   \n",
       "1              3499.12  4.790000e+09           78.14   \n",
       "\n",
       "   Trade_balance_goods_and_services  Under_five_mortality_from_CME  \\\n",
       "0                               NaN                         257.00   \n",
       "1                     -2.040000e+09                          18.47   \n",
       "\n",
       "   Under_five_mortality_from_IHME  Under_five_mortality_rate  \\\n",
       "0                           231.9                     257.00   \n",
       "1                            15.5                      18.47   \n",
       "\n",
       "   Urban_population  Urban_population_growth  Urban_population_pct_of_total  \n",
       "0         5740436.0                     5.44                           22.9  \n",
       "1         1431793.9                     2.21                           45.4  \n",
       "\n",
       "[2 rows x 358 columns]"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[0:1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Las columnas se deben seleccionar con una **lista** que debe contener el nombre de las columnas deseadas."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Continent</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Continent\n",
       "0          1\n",
       "1          2"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[0:1,[\"Continent\"]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Continent</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>692.50</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>3499.12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>137535.56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Continent  Total_CO2_emissions\n",
       "0          1               692.50\n",
       "1          2              3499.12\n",
       "2          3            137535.56\n",
       "3          2                  NaN"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[0:3,[\"Continent\",\"Total_CO2_emissions\"]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>1</td>\n",
       "      <td>692.50</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Andorra</td>\n",
       "      <td>2</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>Bhutan</td>\n",
       "      <td>7</td>\n",
       "      <td>414.03</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>100</th>\n",
       "      <td>Liberia</td>\n",
       "      <td>3</td>\n",
       "      <td>472.66</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         Country  Continent  Total_CO2_emissions\n",
       "0    Afghanistan          1               692.50\n",
       "3        Andorra          2                  NaN\n",
       "20        Bhutan          7               414.03\n",
       "100      Liberia          3               472.66"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[[0,3,20,100],[\"Country\",\"Continent\",\"Total_CO2_emissions\"]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Alternativamente, con el atributo _iloc_ podemos seleccionar las columnas con su índice numérico: su posicion en la lista de columnas."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Country                           Afghanistan\n",
       "CountryID                                   1\n",
       "Continent                                   1\n",
       "Adolescent fertility rate (%)           151.0\n",
       "Adult literacy rate (%)                  28.0\n",
       "                                     ...     \n",
       "Under_five_mortality_from_IHME          231.9\n",
       "Under_five_mortality_rate               257.0\n",
       "Urban_population                    5740436.0\n",
       "Urban_population_growth                  5.44\n",
       "Urban_population_pct_of_total            22.9\n",
       "Name: 0, Length: 358, dtype: object"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.iloc[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>151.0</td>\n",
       "      <td>28.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>27.0</td>\n",
       "      <td>98.7</td>\n",
       "      <td>6000.0</td>\n",
       "      <td>93.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>6.0</td>\n",
       "      <td>69.9</td>\n",
       "      <td>5940.0</td>\n",
       "      <td>94.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>83.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Adolescent fertility rate (%)  Adult literacy rate (%)  \\\n",
       "0                          151.0                     28.0   \n",
       "1                           27.0                     98.7   \n",
       "2                            6.0                     69.9   \n",
       "3                            NaN                      NaN   \n",
       "\n",
       "   Gross national income per capita (PPP international $)  \\\n",
       "0                                                NaN        \n",
       "1                                             6000.0        \n",
       "2                                             5940.0        \n",
       "3                                                NaN        \n",
       "\n",
       "   Net primary school enrolment ratio female (%)  \n",
       "0                                            NaN  \n",
       "1                                           93.0  \n",
       "2                                           94.0  \n",
       "3                                           83.0  "
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.iloc[0:4, 3:7] # ídem a una matriz"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Afghanistan'"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.iloc[0][0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Country                          Afghanistan\n",
       "CountryID                                  1\n",
       "Continent                                  1\n",
       "Adolescent fertility rate (%)          151.0\n",
       "Name: 0, dtype: object"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.iloc[0][0:4]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(['Afghanistan', 1, 1, 151.0], dtype=object)"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.iloc[0][0:4].values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Afghanistan'"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.iloc[0][0:4].values[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "### Selección condicional\n",
    "\n",
    "Además de la selección con base a índices, lo interesante es realizar selecciones mediante condiciones lógicas que permiten filtrar las filas del dataset. Por ejemplo:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>85</th>\n",
       "      <td>Italy</td>\n",
       "      <td>98.4</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Country  Adult literacy rate (%)\n",
       "85   Italy                     98.4"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who = pd.read_csv(\"http://www.exploredata.net/ftp/WHO.csv\") #dataframe\n",
    "\n",
    "alfabetitzacio = df_who[df_who['Adult literacy rate (%)'] > 70][[\"Country\",\"Adult literacy rate (%)\"]]\n",
    "\n",
    "\n",
    "alfabetitzacio[alfabetitzacio[\"Country\"] == \"Italy\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0      False\n",
       "1       True\n",
       "2      False\n",
       "3      False\n",
       "4      False\n",
       "       ...  \n",
       "197     True\n",
       "198    False\n",
       "199    False\n",
       "200    False\n",
       "201     True\n",
       "Name: Adult literacy rate (%), Length: 202, dtype: bool"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who['Adult literacy rate (%)'] > 70"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "seleccio = df_who['Adult literacy rate (%)'] > 70"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Albania</td>\n",
       "      <td>98.7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Argentina</td>\n",
       "      <td>97.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Armenia</td>\n",
       "      <td>99.4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Azerbaijan</td>\n",
       "      <td>98.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>Bahrain</td>\n",
       "      <td>86.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>193</th>\n",
       "      <td>Uruguay</td>\n",
       "      <td>96.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>195</th>\n",
       "      <td>Vanuatu</td>\n",
       "      <td>75.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>196</th>\n",
       "      <td>Venezuela</td>\n",
       "      <td>93.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>197</th>\n",
       "      <td>Vietnam</td>\n",
       "      <td>90.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>201</th>\n",
       "      <td>Zimbabwe</td>\n",
       "      <td>89.5</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>93 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "        Country  Adult literacy rate (%)\n",
       "1       Albania                     98.7\n",
       "6     Argentina                     97.2\n",
       "7       Armenia                     99.4\n",
       "10   Azerbaijan                     98.8\n",
       "12      Bahrain                     86.5\n",
       "..          ...                      ...\n",
       "193     Uruguay                     96.8\n",
       "195     Vanuatu                     75.5\n",
       "196   Venezuela                     93.0\n",
       "197     Vietnam                     90.3\n",
       "201    Zimbabwe                     89.5\n",
       "\n",
       "[93 rows x 2 columns]"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# En aquest codi, de les files on seleccio == True agafam les dues columnes que ens interessen\n",
    "df_who[seleccio][[\"Country\",\"Adult literacy rate (%)\"]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([92]),)"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Multiples criterios\n",
    "# CO_emisions y Fertilidad\n",
    "import numpy as np\n",
    "ix = np.where((df_who[\"Total_CO2_emissions\"] > 10) & (df_who[df_who.columns[3]] <=0.6))\n",
    "ix"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Estadísticas en un DataFrame"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>one</th>\n",
       "      <th>two</th>\n",
       "      <th>three</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>-1</td>\n",
       "      <td>0.748804</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>-6</td>\n",
       "      <td>0.498507</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5</td>\n",
       "      <td>0.224797</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>-10</td>\n",
       "      <td>0.198063</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>7</td>\n",
       "      <td>0.760531</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   one       two  three\n",
       "0   -1  0.748804      0\n",
       "1   -6  0.498507      2\n",
       "2    5  0.224797      0\n",
       "3  -10  0.198063      4\n",
       "4    7  0.760531      3"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Creamos un dataframe  con valores aleatorios\n",
    "import numpy as np\n",
    "\n",
    "np.random.seed(10)\n",
    "\n",
    "df = pd.DataFrame({\"one\":np.random.randint(-10,10,5),\n",
    "                  \"two\":np.random.random(5),\n",
    "                  \"three\":np.random.randint(0,5,5)})\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>one</th>\n",
       "      <td>-1.000000</td>\n",
       "      <td>-6.000000</td>\n",
       "      <td>5.000000</td>\n",
       "      <td>-10.000000</td>\n",
       "      <td>7.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>two</th>\n",
       "      <td>0.748804</td>\n",
       "      <td>0.498507</td>\n",
       "      <td>0.224797</td>\n",
       "      <td>0.198063</td>\n",
       "      <td>0.760531</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>three</th>\n",
       "      <td>0.000000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>4.000000</td>\n",
       "      <td>3.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              0         1         2          3         4\n",
       "one   -1.000000 -6.000000  5.000000 -10.000000  7.000000\n",
       "two    0.748804  0.498507  0.224797   0.198063  0.760531\n",
       "three  0.000000  2.000000  0.000000   4.000000  3.000000"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.T"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "one     -5.000000\n",
       "two      2.430701\n",
       "three    9.000000\n",
       "dtype: float64"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.sum()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0    -0.251196\n",
       "1    -3.501493\n",
       "2     5.224797\n",
       "3    -5.801937\n",
       "4    10.760531\n",
       "dtype: float64"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.sum(axis=1) # concepto de axis "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>one</th>\n",
       "      <th>two</th>\n",
       "      <th>three</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>-1</td>\n",
       "      <td>0.748804</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>-7</td>\n",
       "      <td>1.247311</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>-2</td>\n",
       "      <td>1.472108</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>-12</td>\n",
       "      <td>1.670170</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>-5</td>\n",
       "      <td>2.430701</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   one       two  three\n",
       "0   -1  0.748804      0\n",
       "1   -7  1.247311      2\n",
       "2   -2  1.472108      2\n",
       "3  -12  1.670170      6\n",
       "4   -5  2.430701      9"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.cumsum()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>one</th>\n",
       "      <th>two</th>\n",
       "      <th>three</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>-1.0</td>\n",
       "      <td>-0.251196</td>\n",
       "      <td>-0.251196</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>-6.0</td>\n",
       "      <td>-5.501493</td>\n",
       "      <td>-3.501493</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5.0</td>\n",
       "      <td>5.224797</td>\n",
       "      <td>5.224797</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>-10.0</td>\n",
       "      <td>-9.801937</td>\n",
       "      <td>-5.801937</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>7.0</td>\n",
       "      <td>7.760531</td>\n",
       "      <td>10.760531</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    one       two      three\n",
       "0  -1.0 -0.251196  -0.251196\n",
       "1  -6.0 -5.501493  -3.501493\n",
       "2   5.0  5.224797   5.224797\n",
       "3 -10.0 -9.801937  -5.801937\n",
       "4   7.0  7.760531  10.760531"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.cumsum(axis=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>one</th>\n",
       "      <th>two</th>\n",
       "      <th>three</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0.748804</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>7</td>\n",
       "      <td>1.247311</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>12</td>\n",
       "      <td>1.472108</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>22</td>\n",
       "      <td>1.670170</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>29</td>\n",
       "      <td>2.430701</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   one       two  three\n",
       "0    1  0.748804      0\n",
       "1    7  1.247311      2\n",
       "2   12  1.472108      2\n",
       "3   22  1.670170      6\n",
       "4   29  2.430701      9"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.apply(np.abs,axis=1).cumsum() # función apply"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>one</th>\n",
       "      <th>two</th>\n",
       "      <th>three</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1.0</td>\n",
       "      <td>1.748804</td>\n",
       "      <td>1.748804</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>6.0</td>\n",
       "      <td>6.498507</td>\n",
       "      <td>8.498507</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5.0</td>\n",
       "      <td>5.224797</td>\n",
       "      <td>5.224797</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>10.0</td>\n",
       "      <td>10.198063</td>\n",
       "      <td>14.198063</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>7.0</td>\n",
       "      <td>7.760531</td>\n",
       "      <td>10.760531</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    one        two      three\n",
       "0   1.0   1.748804   1.748804\n",
       "1   6.0   6.498507   8.498507\n",
       "2   5.0   5.224797   5.224797\n",
       "3  10.0  10.198063  14.198063\n",
       "4   7.0   7.760531  10.760531"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.apply(np.abs).cumsum(axis=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>one</th>\n",
       "      <th>two</th>\n",
       "      <th>three</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>-1</td>\n",
       "      <td>0.748804</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>6</td>\n",
       "      <td>0.373284</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>30</td>\n",
       "      <td>0.083913</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>-300</td>\n",
       "      <td>0.016620</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>-2100</td>\n",
       "      <td>0.012640</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    one       two  three\n",
       "0    -1  0.748804      0\n",
       "1     6  0.373284      0\n",
       "2    30  0.083913      0\n",
       "3  -300  0.016620      0\n",
       "4 -2100  0.012640      0"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.cumprod()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>one</th>\n",
       "      <th>two</th>\n",
       "      <th>three</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>-1.0</td>\n",
       "      <td>-0.748804</td>\n",
       "      <td>-0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>-6.0</td>\n",
       "      <td>-2.991042</td>\n",
       "      <td>-5.982084</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5.0</td>\n",
       "      <td>1.123983</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>-10.0</td>\n",
       "      <td>-1.980629</td>\n",
       "      <td>-7.922515</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>7.0</td>\n",
       "      <td>5.323715</td>\n",
       "      <td>15.971145</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    one       two      three\n",
       "0  -1.0 -0.748804  -0.000000\n",
       "1  -6.0 -2.991042  -5.982084\n",
       "2   5.0  1.123983   0.000000\n",
       "3 -10.0 -1.980629  -7.922515\n",
       "4   7.0  5.323715  15.971145"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.cumprod(axis=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1. 1. 1. 1. 1.]\n"
     ]
    }
   ],
   "source": [
    "ones = np.ones(5) # numpy arrays\n",
    "print(ones)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>one</th>\n",
       "      <th>two</th>\n",
       "      <th>three</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>-2.0</td>\n",
       "      <td>-0.251196</td>\n",
       "      <td>-1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>-7.0</td>\n",
       "      <td>-0.501493</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>4.0</td>\n",
       "      <td>-0.775203</td>\n",
       "      <td>-1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>-11.0</td>\n",
       "      <td>-0.801937</td>\n",
       "      <td>3.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>6.0</td>\n",
       "      <td>-0.239469</td>\n",
       "      <td>2.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    one       two  three\n",
       "0  -2.0 -0.251196   -1.0\n",
       "1  -7.0 -0.501493    1.0\n",
       "2   4.0 -0.775203   -1.0\n",
       "3 -11.0 -0.801937    3.0\n",
       "4   6.0 -0.239469    2.0"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.sub(ones,axis=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "ename": "ValueError",
     "evalue": "Unable to coerce to Series, length must be 3: given 5",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[1;32m/Users/isaac/Projects/TxADM_notebooks/notebooks/Part2/00_Pandas/01_Introduccion.ipynb Cell 131\u001b[0m line \u001b[0;36m<cell line: 1>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> <a href='vscode-notebook-cell:/Users/isaac/Projects/TxADM_notebooks/notebooks/Part2/00_Pandas/01_Introduccion.ipynb#Y241sZmlsZQ%3D%3D?line=0'>1</a>\u001b[0m df\u001b[39m.\u001b[39;49msub(ones)\n",
      "File \u001b[0;32m~/.pyenv/versions/3.9.7/lib/python3.9/site-packages/pandas/core/ops/__init__.py:436\u001b[0m, in \u001b[0;36mflex_arith_method_FRAME.<locals>.f\u001b[0;34m(self, other, axis, level, fill_value)\u001b[0m\n\u001b[1;32m    433\u001b[0m axis \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_get_axis_number(axis) \u001b[39mif\u001b[39;00m axis \u001b[39mis\u001b[39;00m \u001b[39mnot\u001b[39;00m \u001b[39mNone\u001b[39;00m \u001b[39melse\u001b[39;00m \u001b[39m1\u001b[39m\n\u001b[1;32m    435\u001b[0m other \u001b[39m=\u001b[39m maybe_prepare_scalar_for_op(other, \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mshape)\n\u001b[0;32m--> 436\u001b[0m \u001b[39mself\u001b[39m, other \u001b[39m=\u001b[39m align_method_FRAME(\u001b[39mself\u001b[39;49m, other, axis, flex\u001b[39m=\u001b[39;49m\u001b[39mTrue\u001b[39;49;00m, level\u001b[39m=\u001b[39;49mlevel)\n\u001b[1;32m    438\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39misinstance\u001b[39m(other, ABCDataFrame):\n\u001b[1;32m    439\u001b[0m     \u001b[39m# Another DataFrame\u001b[39;00m\n\u001b[1;32m    440\u001b[0m     new_data \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_combine_frame(other, na_op, fill_value)\n",
      "File \u001b[0;32m~/.pyenv/versions/3.9.7/lib/python3.9/site-packages/pandas/core/ops/__init__.py:248\u001b[0m, in \u001b[0;36malign_method_FRAME\u001b[0;34m(left, right, axis, flex, level)\u001b[0m\n\u001b[1;32m    245\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39misinstance\u001b[39m(right, np\u001b[39m.\u001b[39mndarray):\n\u001b[1;32m    247\u001b[0m     \u001b[39mif\u001b[39;00m right\u001b[39m.\u001b[39mndim \u001b[39m==\u001b[39m \u001b[39m1\u001b[39m:\n\u001b[0;32m--> 248\u001b[0m         right \u001b[39m=\u001b[39m to_series(right)\n\u001b[1;32m    250\u001b[0m     \u001b[39melif\u001b[39;00m right\u001b[39m.\u001b[39mndim \u001b[39m==\u001b[39m \u001b[39m2\u001b[39m:\n\u001b[1;32m    251\u001b[0m         \u001b[39mif\u001b[39;00m right\u001b[39m.\u001b[39mshape \u001b[39m==\u001b[39m left\u001b[39m.\u001b[39mshape:\n",
      "File \u001b[0;32m~/.pyenv/versions/3.9.7/lib/python3.9/site-packages/pandas/core/ops/__init__.py:239\u001b[0m, in \u001b[0;36malign_method_FRAME.<locals>.to_series\u001b[0;34m(right)\u001b[0m\n\u001b[1;32m    237\u001b[0m \u001b[39melse\u001b[39;00m:\n\u001b[1;32m    238\u001b[0m     \u001b[39mif\u001b[39;00m \u001b[39mlen\u001b[39m(left\u001b[39m.\u001b[39mcolumns) \u001b[39m!=\u001b[39m \u001b[39mlen\u001b[39m(right):\n\u001b[0;32m--> 239\u001b[0m         \u001b[39mraise\u001b[39;00m \u001b[39mValueError\u001b[39;00m(\n\u001b[1;32m    240\u001b[0m             msg\u001b[39m.\u001b[39mformat(req_len\u001b[39m=\u001b[39m\u001b[39mlen\u001b[39m(left\u001b[39m.\u001b[39mcolumns), given_len\u001b[39m=\u001b[39m\u001b[39mlen\u001b[39m(right))\n\u001b[1;32m    241\u001b[0m         )\n\u001b[1;32m    242\u001b[0m     right \u001b[39m=\u001b[39m left\u001b[39m.\u001b[39m_constructor_sliced(right, index\u001b[39m=\u001b[39mleft\u001b[39m.\u001b[39mcolumns)\n\u001b[1;32m    243\u001b[0m \u001b[39mreturn\u001b[39;00m right\n",
      "\u001b[0;31mValueError\u001b[0m: Unable to coerce to Series, length must be 3: given 5"
     ]
    }
   ],
   "source": [
    "df.sub(ones) #Alerta!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>one</th>\n",
       "      <th>two</th>\n",
       "      <th>three</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.966021</td>\n",
       "      <td>-1.006231</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>-0.696733</td>\n",
       "      <td>0.045482</td>\n",
       "      <td>0.111803</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.836080</td>\n",
       "      <td>-0.961166</td>\n",
       "      <td>-1.006231</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>-1.254119</td>\n",
       "      <td>-1.059487</td>\n",
       "      <td>1.229837</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1.114773</td>\n",
       "      <td>1.009150</td>\n",
       "      <td>0.670820</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        one       two     three\n",
       "0  0.000000  0.966021 -1.006231\n",
       "1 -0.696733  0.045482  0.111803\n",
       "2  0.836080 -0.961166 -1.006231\n",
       "3 -1.254119 -1.059487  1.229837\n",
       "4  1.114773  1.009150  0.670820"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# nornalitzación z-score (media a 0 y desviación a 1)\n",
    "ts_stand = (df - df.mean()) / df.std()\n",
    "ts_stand"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "one      0.000000e+00\n",
      "two      2.664535e-16\n",
      "three    6.661338e-17\n",
      "dtype: float64\n",
      "----\n",
      "one      1.0\n",
      "two      1.0\n",
      "three    1.0\n",
      "dtype: float64\n"
     ]
    }
   ],
   "source": [
    "print(ts_stand.mean())\n",
    "print(\"----\")\n",
    "print(ts_stand.std())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Series no númericas"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0             Afghanistan\n",
       "1                 Albania\n",
       "2                 Algeria\n",
       "3                 Andorra\n",
       "4                  Angola\n",
       "              ...        \n",
       "197               Vietnam\n",
       "198    West Bank and Gaza\n",
       "199                 Yemen\n",
       "200                Zambia\n",
       "201              Zimbabwe\n",
       "Name: Country, Length: 202, dtype: object"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who[\"Country\"] "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0             afghanistan\n",
       "1                 albania\n",
       "2                 algeria\n",
       "3                 andorra\n",
       "4                  angola\n",
       "              ...        \n",
       "197               vietnam\n",
       "198    west bank and gaza\n",
       "199                 yemen\n",
       "200                zambia\n",
       "201              zimbabwe\n",
       "Name: Country, Length: 202, dtype: object"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#https://www.w3schools.com/python/python_ref_string.asp\n",
    "df_who.Country.str.casefold()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0            0Afghanistan\n",
       "1            00000Albania\n",
       "2            00000Algeria\n",
       "3            00000Andorra\n",
       "4            000000Angola\n",
       "              ...        \n",
       "197          00000Vietnam\n",
       "198    West Bank and Gaza\n",
       "199          0000000Yemen\n",
       "200          000000Zambia\n",
       "201          0000Zimbabwe\n",
       "Name: Country, Length: 202, dtype: object"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who.Country.str.zfill(12)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## Modificación del dataframe\n",
    "\n",
    "Además de realizar selecciones, en algunos momentos necesitaremos incorporar nueva información a nuestras tablas de datos.\n",
    "\n",
    "Vamos a crear un pequeño conjunto para practicar:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>type</th>\n",
       "      <th>AvgBill</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Foreign Cinema</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>289.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Liho Liho</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>224.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>500 Club</td>\n",
       "      <td>bar</td>\n",
       "      <td>80.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>The Square</td>\n",
       "      <td>bar</td>\n",
       "      <td>25.3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             name        type  AvgBill\n",
       "0  Foreign Cinema  Restaurant    289.0\n",
       "1       Liho Liho  Restaurant    224.0\n",
       "2        500 Club         bar     80.5\n",
       "3      The Square         bar     25.3"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2 = pd.DataFrame([('Foreign Cinema', 'Restaurant', 289.0),\n",
    "                   ('Liho Liho', 'Restaurant', 224.0),\n",
    "                   ('500 Club', 'bar', 80.5),\n",
    "                   ('The Square', 'bar', 25.30)],\n",
    "           columns=('name', 'type', 'AvgBill')\n",
    "                 )\n",
    "df2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "### Añadir columnas\n",
    "\n",
    "Tenemos diversas maneras de añadir columnas a un _dataFrame_:\n",
    "\n",
    "- Mediante el nombre de la columna que queremos añadir, tal como añadimos una nueva clave a un diccionario.\n",
    "- `insert`: es un método que necesita 3 parámetros. La posición en la que queremos añadir la columna (`loc`), su nombre (´column´) y la lista de valores (`value`).\n",
    "- `assign`: muy similar a la anterior, pero permite añadir múltiples columnas.\n",
    "- `concat`: no se suele usar para concatenar columnas, en el caso que queramos usarlo para este caso, deberemos poner el parámetro `axis=1`.\n",
    "\n",
    "Veamos algunos ejemplos:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>type</th>\n",
       "      <th>AvgBill</th>\n",
       "      <th>Day</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Foreign Cinema</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>289.0</td>\n",
       "      <td>Monday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Liho Liho</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>224.0</td>\n",
       "      <td>Monday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>500 Club</td>\n",
       "      <td>bar</td>\n",
       "      <td>80.5</td>\n",
       "      <td>Monday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>The Square</td>\n",
       "      <td>bar</td>\n",
       "      <td>25.3</td>\n",
       "      <td>Monday</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             name        type  AvgBill     Day\n",
       "0  Foreign Cinema  Restaurant    289.0  Monday\n",
       "1       Liho Liho  Restaurant    224.0  Monday\n",
       "2        500 Club         bar     80.5  Monday\n",
       "3      The Square         bar     25.3  Monday"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2['Day'] = \"Monday\" # Como un diccionario\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "ename": "ValueError",
     "evalue": "Length of values (3) does not match length of index (4)",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[39], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mdf2\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[38;5;124;43mDay\u001b[39;49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[43m]\u001b[49m \u001b[38;5;241m=\u001b[39m [\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mMonday\u001b[39m\u001b[38;5;124m'\u001b[39m, \u001b[38;5;124m'\u001b[39m\u001b[38;5;124mTuesday\u001b[39m\u001b[38;5;124m'\u001b[39m, \u001b[38;5;124m'\u001b[39m\u001b[38;5;124mWednesday\u001b[39m\u001b[38;5;124m'\u001b[39m]\n\u001b[1;32m      2\u001b[0m df2\n",
      "File \u001b[0;32m~/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages/pandas/core/frame.py:3980\u001b[0m, in \u001b[0;36mDataFrame.__setitem__\u001b[0;34m(self, key, value)\u001b[0m\n\u001b[1;32m   3977\u001b[0m     \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_setitem_array([key], value)\n\u001b[1;32m   3978\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m   3979\u001b[0m     \u001b[38;5;66;03m# set column\u001b[39;00m\n\u001b[0;32m-> 3980\u001b[0m     \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_set_item\u001b[49m\u001b[43m(\u001b[49m\u001b[43mkey\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mvalue\u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m~/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages/pandas/core/frame.py:4174\u001b[0m, in \u001b[0;36mDataFrame._set_item\u001b[0;34m(self, key, value)\u001b[0m\n\u001b[1;32m   4164\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21m_set_item\u001b[39m(\u001b[38;5;28mself\u001b[39m, key, value) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m   4165\u001b[0m \u001b[38;5;250m    \u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m   4166\u001b[0m \u001b[38;5;124;03m    Add series to DataFrame in specified column.\u001b[39;00m\n\u001b[1;32m   4167\u001b[0m \n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m   4172\u001b[0m \u001b[38;5;124;03m    ensure homogeneity.\u001b[39;00m\n\u001b[1;32m   4173\u001b[0m \u001b[38;5;124;03m    \"\"\"\u001b[39;00m\n\u001b[0;32m-> 4174\u001b[0m     value \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_sanitize_column\u001b[49m\u001b[43m(\u001b[49m\u001b[43mvalue\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m   4176\u001b[0m     \u001b[38;5;28;01mif\u001b[39;00m (\n\u001b[1;32m   4177\u001b[0m         key \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mcolumns\n\u001b[1;32m   4178\u001b[0m         \u001b[38;5;129;01mand\u001b[39;00m value\u001b[38;5;241m.\u001b[39mndim \u001b[38;5;241m==\u001b[39m \u001b[38;5;241m1\u001b[39m\n\u001b[1;32m   4179\u001b[0m         \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m is_extension_array_dtype(value)\n\u001b[1;32m   4180\u001b[0m     ):\n\u001b[1;32m   4181\u001b[0m         \u001b[38;5;66;03m# broadcast across multiple columns if necessary\u001b[39;00m\n\u001b[1;32m   4182\u001b[0m         \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mcolumns\u001b[38;5;241m.\u001b[39mis_unique \u001b[38;5;129;01mor\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mcolumns, MultiIndex):\n",
      "File \u001b[0;32m~/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages/pandas/core/frame.py:4915\u001b[0m, in \u001b[0;36mDataFrame._sanitize_column\u001b[0;34m(self, value)\u001b[0m\n\u001b[1;32m   4912\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m _reindex_for_setitem(Series(value), \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mindex)\n\u001b[1;32m   4914\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_list_like(value):\n\u001b[0;32m-> 4915\u001b[0m     \u001b[43mcom\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrequire_length_match\u001b[49m\u001b[43m(\u001b[49m\u001b[43mvalue\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mindex\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m   4916\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m sanitize_array(value, \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mindex, copy\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m, allow_2d\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m)\n",
      "File \u001b[0;32m~/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages/pandas/core/common.py:571\u001b[0m, in \u001b[0;36mrequire_length_match\u001b[0;34m(data, index)\u001b[0m\n\u001b[1;32m    567\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m    568\u001b[0m \u001b[38;5;124;03mCheck the length of data matches the length of the index.\u001b[39;00m\n\u001b[1;32m    569\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m    570\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(data) \u001b[38;5;241m!=\u001b[39m \u001b[38;5;28mlen\u001b[39m(index):\n\u001b[0;32m--> 571\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\n\u001b[1;32m    572\u001b[0m         \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mLength of values \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m    573\u001b[0m         \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m(\u001b[39m\u001b[38;5;132;01m{\u001b[39;00m\u001b[38;5;28mlen\u001b[39m(data)\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m) \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m    574\u001b[0m         \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mdoes not match length of index \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m    575\u001b[0m         \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m(\u001b[39m\u001b[38;5;132;01m{\u001b[39;00m\u001b[38;5;28mlen\u001b[39m(index)\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m)\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m    576\u001b[0m     )\n",
      "\u001b[0;31mValueError\u001b[0m: Length of values (3) does not match length of index (4)"
     ]
    }
   ],
   "source": [
    "df2['Day'] = ['Monday', 'Tuesday', 'Wednesday']\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>Stars</th>\n",
       "      <th>type</th>\n",
       "      <th>AvgBill</th>\n",
       "      <th>Day</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Foreign Cinema</td>\n",
       "      <td>2</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>289.0</td>\n",
       "      <td>Monday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Liho Liho</td>\n",
       "      <td>2</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>224.0</td>\n",
       "      <td>Monday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>500 Club</td>\n",
       "      <td>3</td>\n",
       "      <td>bar</td>\n",
       "      <td>80.5</td>\n",
       "      <td>Monday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>The Square</td>\n",
       "      <td>4</td>\n",
       "      <td>bar</td>\n",
       "      <td>25.3</td>\n",
       "      <td>Monday</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             name  Stars        type  AvgBill     Day\n",
       "0  Foreign Cinema      2  Restaurant    289.0  Monday\n",
       "1       Liho Liho      2  Restaurant    224.0  Monday\n",
       "2        500 Club      3         bar     80.5  Monday\n",
       "3      The Square      4         bar     25.3  Monday"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Vamos a usar el método insert\n",
    "df2.insert(loc=1, column=\"Stars\", value=[2,2,3,4])\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>Stars</th>\n",
       "      <th>type</th>\n",
       "      <th>AvgBill</th>\n",
       "      <th>Day</th>\n",
       "      <th>AvgBillIVA</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Foreign Cinema</td>\n",
       "      <td>2</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>289.0</td>\n",
       "      <td>Monday</td>\n",
       "      <td>349.690</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Liho Liho</td>\n",
       "      <td>2</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>224.0</td>\n",
       "      <td>Monday</td>\n",
       "      <td>271.040</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>500 Club</td>\n",
       "      <td>3</td>\n",
       "      <td>bar</td>\n",
       "      <td>80.5</td>\n",
       "      <td>Monday</td>\n",
       "      <td>97.405</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>The Square</td>\n",
       "      <td>4</td>\n",
       "      <td>bar</td>\n",
       "      <td>25.3</td>\n",
       "      <td>Monday</td>\n",
       "      <td>30.613</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             name  Stars        type  AvgBill     Day  AvgBillIVA\n",
       "0  Foreign Cinema      2  Restaurant    289.0  Monday     349.690\n",
       "1       Liho Liho      2  Restaurant    224.0  Monday     271.040\n",
       "2        500 Club      3         bar     80.5  Monday      97.405\n",
       "3      The Square      4         bar     25.3  Monday      30.613"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2[\"AvgBillIVA\"] = df2[\"AvgBill\"]*1.21\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "df3 = df2.assign(AvgHalfBill=df2.AvgBill / 2, Michelin_Star=3)\n",
    "df3\n",
    "\n",
    "df3[\"HOLA\"] = df3.name.str.capitalize()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>Stars</th>\n",
       "      <th>type</th>\n",
       "      <th>AvgBill</th>\n",
       "      <th>Day</th>\n",
       "      <th>AvgHalfBill</th>\n",
       "      <th>Michelin_Star</th>\n",
       "      <th>HOLA</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Foreign Cinema</td>\n",
       "      <td>2</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>289.0</td>\n",
       "      <td>Monday</td>\n",
       "      <td>144.50</td>\n",
       "      <td>3</td>\n",
       "      <td>Foreign cinema</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Liho Liho</td>\n",
       "      <td>2</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>224.0</td>\n",
       "      <td>Tuesday</td>\n",
       "      <td>112.00</td>\n",
       "      <td>3</td>\n",
       "      <td>Liho liho</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>500 Club</td>\n",
       "      <td>3</td>\n",
       "      <td>bar</td>\n",
       "      <td>80.5</td>\n",
       "      <td>Wednesday</td>\n",
       "      <td>40.25</td>\n",
       "      <td>3</td>\n",
       "      <td>500 club</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>The Square</td>\n",
       "      <td>4</td>\n",
       "      <td>bar</td>\n",
       "      <td>25.3</td>\n",
       "      <td>Thursday</td>\n",
       "      <td>12.65</td>\n",
       "      <td>3</td>\n",
       "      <td>The square</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             name  Stars        type  AvgBill        Day  AvgHalfBill  \\\n",
       "0  Foreign Cinema      2  Restaurant    289.0     Monday       144.50   \n",
       "1       Liho Liho      2  Restaurant    224.0    Tuesday       112.00   \n",
       "2        500 Club      3         bar     80.5  Wednesday        40.25   \n",
       "3      The Square      4         bar     25.3   Thursday        12.65   \n",
       "\n",
       "   Michelin_Star            HOLA  \n",
       "0              3  Foreign cinema  \n",
       "1              3       Liho liho  \n",
       "2              3        500 club  \n",
       "3              3      The square  "
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Añadir filas\n",
    "\n",
    "Para agregar una nueva fila a un DataFrame en pandas, una forma común es crear un DataFrame con esa fila y luego unirlo al original usando la función `pd.concat()`. **Ejemplo**:\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# DataFrame original\n",
    "df = pd.DataFrame({\n",
    "    \"nombre\": [\"Ana\", \"Luis\"],\n",
    "    \"edad\": [28, 34]\n",
    "})\n",
    "\n",
    "# Nueva fila como DataFrame\n",
    "nueva_fila = pd.DataFrame({\n",
    "    \"nombre\": [\"Carlos\"],\n",
    "    \"edad\": [30]\n",
    "})\n",
    "\n",
    "# Concatenar\n",
    "df = pd.concat([df, nueva_fila], ignore_index=True)\n",
    "```\n",
    "\n",
    "La función `pandas.concat()` sirve para unir objetos de tipo Series o DataFrame. Para añadir una fila:\n",
    "- Se pasa una lista que contiene el DataFrame original y el nuevo DataFrame.\n",
    "- `ignore_index=True` hace que se reajusten los índices del resultado.\n",
    "- El método no modifica los DataFrames originales a menos que reasignes el resultado."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "\n",
    "\n",
    "#### Eliminar filas y columnas\n",
    "\n",
    "\n",
    "Tenemos el método `drop` que nos proporciona un nuevo _dataFrame_ sin la(s) fila(s) o la(s) columna(s) que seleccionemos. \n",
    "Si queremos eliminar columnas podemos hacerlo especificando la lista de columnas en el parámetro `columns` de la siguiente manera:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>Stars</th>\n",
       "      <th>type</th>\n",
       "      <th>AvgBill</th>\n",
       "      <th>Day</th>\n",
       "      <th>AvgHalfBill</th>\n",
       "      <th>Michelin_Star</th>\n",
       "      <th>HOLA</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Foreign Cinema</td>\n",
       "      <td>2</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>289.0</td>\n",
       "      <td>Monday</td>\n",
       "      <td>144.50</td>\n",
       "      <td>3</td>\n",
       "      <td>Foreign cinema</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Liho Liho</td>\n",
       "      <td>2</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>224.0</td>\n",
       "      <td>Monday</td>\n",
       "      <td>112.00</td>\n",
       "      <td>3</td>\n",
       "      <td>Liho liho</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>500 Club</td>\n",
       "      <td>3</td>\n",
       "      <td>bar</td>\n",
       "      <td>80.5</td>\n",
       "      <td>Monday</td>\n",
       "      <td>40.25</td>\n",
       "      <td>3</td>\n",
       "      <td>500 club</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>The Square</td>\n",
       "      <td>4</td>\n",
       "      <td>bar</td>\n",
       "      <td>25.3</td>\n",
       "      <td>Monday</td>\n",
       "      <td>12.65</td>\n",
       "      <td>3</td>\n",
       "      <td>The square</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             name  Stars        type  AvgBill     Day  AvgHalfBill  \\\n",
       "0  Foreign Cinema      2  Restaurant    289.0  Monday       144.50   \n",
       "1       Liho Liho      2  Restaurant    224.0  Monday       112.00   \n",
       "2        500 Club      3         bar     80.5  Monday        40.25   \n",
       "3      The Square      4         bar     25.3  Monday        12.65   \n",
       "\n",
       "   Michelin_Star            HOLA  \n",
       "0              3  Foreign cinema  \n",
       "1              3       Liho liho  \n",
       "2              3        500 club  \n",
       "3              3      The square  "
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "ename": "KeyError",
     "evalue": "\"['Stars'] not found in axis\"",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mKeyError\u001b[0m                                  Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[49], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mdf3\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdrop\u001b[49m\u001b[43m(\u001b[49m\u001b[43mcolumns\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m[\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mStars\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\u001b[43minplace\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\u001b[43m)\u001b[49m \u001b[38;5;66;03m# Eliminamos la última columna que hemos creado\u001b[39;00m\n",
      "File \u001b[0;32m~/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages/pandas/util/_decorators.py:331\u001b[0m, in \u001b[0;36mdeprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m    325\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(args) \u001b[38;5;241m>\u001b[39m num_allow_args:\n\u001b[1;32m    326\u001b[0m     warnings\u001b[38;5;241m.\u001b[39mwarn(\n\u001b[1;32m    327\u001b[0m         msg\u001b[38;5;241m.\u001b[39mformat(arguments\u001b[38;5;241m=\u001b[39m_format_argument_list(allow_args)),\n\u001b[1;32m    328\u001b[0m         \u001b[38;5;167;01mFutureWarning\u001b[39;00m,\n\u001b[1;32m    329\u001b[0m         stacklevel\u001b[38;5;241m=\u001b[39mfind_stack_level(),\n\u001b[1;32m    330\u001b[0m     )\n\u001b[0;32m--> 331\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mfunc\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m~/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages/pandas/core/frame.py:5399\u001b[0m, in \u001b[0;36mDataFrame.drop\u001b[0;34m(self, labels, axis, index, columns, level, inplace, errors)\u001b[0m\n\u001b[1;32m   5251\u001b[0m \u001b[38;5;129m@deprecate_nonkeyword_arguments\u001b[39m(version\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mNone\u001b[39;00m, allowed_args\u001b[38;5;241m=\u001b[39m[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mself\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mlabels\u001b[39m\u001b[38;5;124m\"\u001b[39m])\n\u001b[1;32m   5252\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mdrop\u001b[39m(  \u001b[38;5;66;03m# type: ignore[override]\u001b[39;00m\n\u001b[1;32m   5253\u001b[0m     \u001b[38;5;28mself\u001b[39m,\n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m   5260\u001b[0m     errors: IgnoreRaise \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mraise\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[1;32m   5261\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m DataFrame \u001b[38;5;241m|\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m   5262\u001b[0m \u001b[38;5;250m    \u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m   5263\u001b[0m \u001b[38;5;124;03m    Drop specified labels from rows or columns.\u001b[39;00m\n\u001b[1;32m   5264\u001b[0m \n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m   5397\u001b[0m \u001b[38;5;124;03m            weight  1.0     0.8\u001b[39;00m\n\u001b[1;32m   5398\u001b[0m \u001b[38;5;124;03m    \"\"\"\u001b[39;00m\n\u001b[0;32m-> 5399\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43msuper\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdrop\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m   5400\u001b[0m \u001b[43m        \u001b[49m\u001b[43mlabels\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mlabels\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m   5401\u001b[0m \u001b[43m        \u001b[49m\u001b[43maxis\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43maxis\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m   5402\u001b[0m \u001b[43m        \u001b[49m\u001b[43mindex\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mindex\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m   5403\u001b[0m \u001b[43m        \u001b[49m\u001b[43mcolumns\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcolumns\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m   5404\u001b[0m \u001b[43m        \u001b[49m\u001b[43mlevel\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mlevel\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m   5405\u001b[0m \u001b[43m        \u001b[49m\u001b[43minplace\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43minplace\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m   5406\u001b[0m \u001b[43m        \u001b[49m\u001b[43merrors\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43merrors\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m   5407\u001b[0m \u001b[43m    \u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m~/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages/pandas/util/_decorators.py:331\u001b[0m, in \u001b[0;36mdeprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m    325\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(args) \u001b[38;5;241m>\u001b[39m num_allow_args:\n\u001b[1;32m    326\u001b[0m     warnings\u001b[38;5;241m.\u001b[39mwarn(\n\u001b[1;32m    327\u001b[0m         msg\u001b[38;5;241m.\u001b[39mformat(arguments\u001b[38;5;241m=\u001b[39m_format_argument_list(allow_args)),\n\u001b[1;32m    328\u001b[0m         \u001b[38;5;167;01mFutureWarning\u001b[39;00m,\n\u001b[1;32m    329\u001b[0m         stacklevel\u001b[38;5;241m=\u001b[39mfind_stack_level(),\n\u001b[1;32m    330\u001b[0m     )\n\u001b[0;32m--> 331\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mfunc\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m~/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages/pandas/core/generic.py:4505\u001b[0m, in \u001b[0;36mNDFrame.drop\u001b[0;34m(self, labels, axis, index, columns, level, inplace, errors)\u001b[0m\n\u001b[1;32m   4503\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m axis, labels \u001b[38;5;129;01min\u001b[39;00m axes\u001b[38;5;241m.\u001b[39mitems():\n\u001b[1;32m   4504\u001b[0m     \u001b[38;5;28;01mif\u001b[39;00m labels \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[0;32m-> 4505\u001b[0m         obj \u001b[38;5;241m=\u001b[39m \u001b[43mobj\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_drop_axis\u001b[49m\u001b[43m(\u001b[49m\u001b[43mlabels\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mlevel\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mlevel\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43merrors\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43merrors\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m   4507\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m inplace:\n\u001b[1;32m   4508\u001b[0m     \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_update_inplace(obj)\n",
      "File \u001b[0;32m~/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages/pandas/core/generic.py:4546\u001b[0m, in \u001b[0;36mNDFrame._drop_axis\u001b[0;34m(self, labels, axis, level, errors, only_slice)\u001b[0m\n\u001b[1;32m   4544\u001b[0m         new_axis \u001b[38;5;241m=\u001b[39m axis\u001b[38;5;241m.\u001b[39mdrop(labels, level\u001b[38;5;241m=\u001b[39mlevel, errors\u001b[38;5;241m=\u001b[39merrors)\n\u001b[1;32m   4545\u001b[0m     \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[0;32m-> 4546\u001b[0m         new_axis \u001b[38;5;241m=\u001b[39m \u001b[43maxis\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdrop\u001b[49m\u001b[43m(\u001b[49m\u001b[43mlabels\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43merrors\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43merrors\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m   4547\u001b[0m     indexer \u001b[38;5;241m=\u001b[39m axis\u001b[38;5;241m.\u001b[39mget_indexer(new_axis)\n\u001b[1;32m   4549\u001b[0m \u001b[38;5;66;03m# Case for non-unique axis\u001b[39;00m\n\u001b[1;32m   4550\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n",
      "File \u001b[0;32m~/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages/pandas/core/indexes/base.py:6934\u001b[0m, in \u001b[0;36mIndex.drop\u001b[0;34m(self, labels, errors)\u001b[0m\n\u001b[1;32m   6932\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m mask\u001b[38;5;241m.\u001b[39many():\n\u001b[1;32m   6933\u001b[0m     \u001b[38;5;28;01mif\u001b[39;00m errors \u001b[38;5;241m!=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mignore\u001b[39m\u001b[38;5;124m\"\u001b[39m:\n\u001b[0;32m-> 6934\u001b[0m         \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mKeyError\u001b[39;00m(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;132;01m{\u001b[39;00m\u001b[38;5;28mlist\u001b[39m(labels[mask])\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m not found in axis\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m   6935\u001b[0m     indexer \u001b[38;5;241m=\u001b[39m indexer[\u001b[38;5;241m~\u001b[39mmask]\n\u001b[1;32m   6936\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mdelete(indexer)\n",
      "\u001b[0;31mKeyError\u001b[0m: \"['Stars'] not found in axis\""
     ]
    }
   ],
   "source": [
    "df3.drop(columns=[\"Stars\"],inplace=True) # Eliminamos la última columna que hemos creado\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>type</th>\n",
       "      <th>AvgBill</th>\n",
       "      <th>Day</th>\n",
       "      <th>AvgHalfBill</th>\n",
       "      <th>Michelin_Star</th>\n",
       "      <th>HOLA</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Foreign Cinema</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>289.0</td>\n",
       "      <td>Monday</td>\n",
       "      <td>144.50</td>\n",
       "      <td>3</td>\n",
       "      <td>Foreign cinema</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Liho Liho</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>224.0</td>\n",
       "      <td>Monday</td>\n",
       "      <td>112.00</td>\n",
       "      <td>3</td>\n",
       "      <td>Liho liho</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>500 Club</td>\n",
       "      <td>bar</td>\n",
       "      <td>80.5</td>\n",
       "      <td>Monday</td>\n",
       "      <td>40.25</td>\n",
       "      <td>3</td>\n",
       "      <td>500 club</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>The Square</td>\n",
       "      <td>bar</td>\n",
       "      <td>25.3</td>\n",
       "      <td>Monday</td>\n",
       "      <td>12.65</td>\n",
       "      <td>3</td>\n",
       "      <td>The square</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             name        type  AvgBill     Day  AvgHalfBill  Michelin_Star  \\\n",
       "0  Foreign Cinema  Restaurant    289.0  Monday       144.50              3   \n",
       "1       Liho Liho  Restaurant    224.0  Monday       112.00              3   \n",
       "2        500 Club         bar     80.5  Monday        40.25              3   \n",
       "3      The Square         bar     25.3  Monday        12.65              3   \n",
       "\n",
       "             HOLA  \n",
       "0  Foreign cinema  \n",
       "1       Liho liho  \n",
       "2        500 club  \n",
       "3      The square  "
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Para poder eliminar filas, usamos la misma función, esta vez sin el parámetro que hemos usado anteriormente, simplemente indicamos los índices a eliminar:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>Stars</th>\n",
       "      <th>type</th>\n",
       "      <th>AvgBill</th>\n",
       "      <th>Day</th>\n",
       "      <th>AvgHalfBill</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Foreign Cinema</td>\n",
       "      <td>2</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>289.0</td>\n",
       "      <td>Monday</td>\n",
       "      <td>144.50</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>500 Club</td>\n",
       "      <td>3</td>\n",
       "      <td>bar</td>\n",
       "      <td>80.5</td>\n",
       "      <td>Wednesday</td>\n",
       "      <td>40.25</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             name  Stars        type  AvgBill        Day  AvgHalfBill\n",
       "0  Foreign Cinema      2  Restaurant    289.0     Monday       144.50\n",
       "2        500 Club      3         bar     80.5  Wednesday        40.25"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_less_rows = df_no_michelin.drop([1,3])\n",
    "df_less_rows"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Eliminación de filas según un criterio\n",
    "\n",
    "En ocasiones, al trabajar con datos es necesario depurar un `DataFrame` eliminando aquellas filas que no cumplen ciertas condiciones o que contienen valores no deseados. La eliminación de filas según un criterio permite filtrar la información de forma eficiente, manteniendo únicamente los registros relevantes para el análisis. En pandas, este proceso suele realizarse mediante operaciones lógicas y el filtrado booleano. Estas herramientas facilitan un control preciso sobre qué datos se conservan y cuáles se descartan."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>Stars</th>\n",
       "      <th>type</th>\n",
       "      <th>AvgBill</th>\n",
       "      <th>Day</th>\n",
       "      <th>AvgHalfBill</th>\n",
       "      <th>Michelin_Star</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>500 Club</td>\n",
       "      <td>3</td>\n",
       "      <td>bar</td>\n",
       "      <td>80.5</td>\n",
       "      <td>Wednesday</td>\n",
       "      <td>40.25</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>The Square</td>\n",
       "      <td>4</td>\n",
       "      <td>bar</td>\n",
       "      <td>25.3</td>\n",
       "      <td>Thursday</td>\n",
       "      <td>12.65</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         name  Stars type  AvgBill        Day  AvgHalfBill  Michelin_Star\n",
       "2    500 Club      3  bar     80.5  Wednesday        40.25              3\n",
       "3  The Square      4  bar     25.3   Thursday        12.65              3"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dfs1 = df3[df3.Stars>=3] # per una selecció?\n",
    "dfs1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Int64Index([2, 3], dtype='int64')"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3[df3.Stars>=3].index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>Stars</th>\n",
       "      <th>type</th>\n",
       "      <th>AvgBill</th>\n",
       "      <th>Day</th>\n",
       "      <th>AvgHalfBill</th>\n",
       "      <th>Michelin_Star</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Foreign Cinema</td>\n",
       "      <td>2</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>289.0</td>\n",
       "      <td>Monday</td>\n",
       "      <td>144.5</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Liho Liho</td>\n",
       "      <td>2</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>224.0</td>\n",
       "      <td>Tuesday</td>\n",
       "      <td>112.0</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             name  Stars        type  AvgBill      Day  AvgHalfBill  \\\n",
       "0  Foreign Cinema      2  Restaurant    289.0   Monday        144.5   \n",
       "1       Liho Liho      2  Restaurant    224.0  Tuesday        112.0   \n",
       "\n",
       "   Michelin_Star  \n",
       "0              3  \n",
       "1              3  "
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3.drop(df3[df3.Stars>=3].index)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>Stars</th>\n",
       "      <th>type</th>\n",
       "      <th>AvgBill</th>\n",
       "      <th>Day</th>\n",
       "      <th>AvgHalfBill</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Foreign Cinema</td>\n",
       "      <td>2</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>289.0</td>\n",
       "      <td>Monday</td>\n",
       "      <td>144.50</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>500 Club</td>\n",
       "      <td>3</td>\n",
       "      <td>bar</td>\n",
       "      <td>80.5</td>\n",
       "      <td>Wednesday</td>\n",
       "      <td>40.25</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             name  Stars        type  AvgBill        Day  AvgHalfBill\n",
       "0  Foreign Cinema      2  Restaurant    289.0     Monday       144.50\n",
       "2        500 Club      3         bar     80.5  Wednesday        40.25"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3.drop(df3[df3.Stars>=3].index, inplace=True) #Alerta con la integración de los cambios\n",
    "df3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Modificar el valor de un fila según un criterio\n",
    "\n",
    "En algunos casos es necesario actualizar los valores de determinadas filas que cumplan una condición específica. En pandas, esto se logra fácilmente usando filtrado booleano junto con asignaciones directas. Este método permite cambiar solo las filas que coincidan con un criterio, manteniendo el resto del *DataFrame* intacto. **Ejemplo**:\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# DataFrame original\n",
    "df = pd.DataFrame({\n",
    "    \"nombre\": [\"Ana\", \"Luis\", \"Carlos\"],\n",
    "    \"edad\": [28, 34, 30]\n",
    "})\n",
    "\n",
    "# Modificar la edad de la fila donde el nombre es \"Luis\"\n",
    "df.loc[df[\"nombre\"] == \"Luis\", \"edad\"] = 35\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Concatenación y unión de dataframes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A veces, los datos vienen en diferentes archivos y necesitan ser combinados en un único archivo, este proceso implica la concatenación de dataframes. Otras veces, los datos son complementarios, es decir, hay nuevas columnas en un dataframe, y esto se conoce como realizar operaciones de unión (joins)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Concatenación"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>type</th>\n",
       "      <th>AvgBill</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Foreign Cinema</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>289.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Liho Liho</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>224.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>500 Club</td>\n",
       "      <td>bar</td>\n",
       "      <td>80.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>The Square</td>\n",
       "      <td>bar</td>\n",
       "      <td>25.3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             name        type  AvgBill\n",
       "0  Foreign Cinema  Restaurant    289.0\n",
       "1       Liho Liho  Restaurant    224.0\n",
       "2        500 Club         bar     80.5\n",
       "3      The Square         bar     25.3"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = pd.DataFrame([('Foreign Cinema', 'Restaurant', 289.0),\n",
    "                   ('Liho Liho', 'Restaurant', 224.0),\n",
    "                   ('500 Club', 'bar', 80.5),\n",
    "                   ('The Square', 'bar', 25.30)],\n",
    "           columns=('name', 'type', 'AvgBill')\n",
    "                 )\n",
    "df1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name2</th>\n",
       "      <th>type</th>\n",
       "      <th>AvgBill</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Biels</td>\n",
       "      <td>Quiosquet</td>\n",
       "      <td>389.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Barkus</td>\n",
       "      <td>Bar</td>\n",
       "      <td>24.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Blue Wall</td>\n",
       "      <td>bar</td>\n",
       "      <td>80.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Bounty Hunters</td>\n",
       "      <td>Social Club</td>\n",
       "      <td>125.3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            name2         type  AvgBill\n",
       "0           Biels    Quiosquet    389.0\n",
       "1          Barkus          Bar     24.0\n",
       "2       Blue Wall          bar     80.5\n",
       "3  Bounty Hunters  Social Club    125.3"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2 = pd.DataFrame([('Biels', 'Quiosquet', 389.0),\n",
    "                   ('Barkus', 'Bar', 24.0),\n",
    "                   ('Blue Wall', 'bar', 80.5),\n",
    "                   ('Bounty Hunters', 'Social Club', 125.30)],\n",
    "           columns=('name2', 'type', 'AvgBill')\n",
    "                 )\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>type</th>\n",
       "      <th>AvgBill</th>\n",
       "      <th>name2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Foreign Cinema</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>289.0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Liho Liho</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>224.0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>500 Club</td>\n",
       "      <td>bar</td>\n",
       "      <td>80.5</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>The Square</td>\n",
       "      <td>bar</td>\n",
       "      <td>25.3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>NaN</td>\n",
       "      <td>Quiosquet</td>\n",
       "      <td>389.0</td>\n",
       "      <td>Biels</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>NaN</td>\n",
       "      <td>Bar</td>\n",
       "      <td>24.0</td>\n",
       "      <td>Barkus</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>NaN</td>\n",
       "      <td>bar</td>\n",
       "      <td>80.5</td>\n",
       "      <td>Blue Wall</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>NaN</td>\n",
       "      <td>Social Club</td>\n",
       "      <td>125.3</td>\n",
       "      <td>Bounty Hunters</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             name         type  AvgBill           name2\n",
       "0  Foreign Cinema   Restaurant    289.0             NaN\n",
       "1       Liho Liho   Restaurant    224.0             NaN\n",
       "2        500 Club          bar     80.5             NaN\n",
       "3      The Square          bar     25.3             NaN\n",
       "0             NaN    Quiosquet    389.0           Biels\n",
       "1             NaN          Bar     24.0          Barkus\n",
       "2             NaN          bar     80.5       Blue Wall\n",
       "3             NaN  Social Club    125.3  Bounty Hunters"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dfAll = pd.concat([df1,df2])\n",
    "dfAll"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>index</th>\n",
       "      <th>name</th>\n",
       "      <th>type</th>\n",
       "      <th>AvgBill</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>Foreign Cinema</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>289.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>Liho Liho</td>\n",
       "      <td>Restaurant</td>\n",
       "      <td>224.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>500 Club</td>\n",
       "      <td>bar</td>\n",
       "      <td>80.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3</td>\n",
       "      <td>The Square</td>\n",
       "      <td>bar</td>\n",
       "      <td>25.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0</td>\n",
       "      <td>Biels</td>\n",
       "      <td>Quiosquet</td>\n",
       "      <td>389.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>1</td>\n",
       "      <td>Barkus</td>\n",
       "      <td>Bar</td>\n",
       "      <td>24.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>2</td>\n",
       "      <td>Blue Wall</td>\n",
       "      <td>bar</td>\n",
       "      <td>80.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>3</td>\n",
       "      <td>Bounty Hunters</td>\n",
       "      <td>Social Club</td>\n",
       "      <td>125.3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   index            name         type  AvgBill\n",
       "0      0  Foreign Cinema   Restaurant    289.0\n",
       "1      1       Liho Liho   Restaurant    224.0\n",
       "2      2        500 Club          bar     80.5\n",
       "3      3      The Square          bar     25.3\n",
       "4      0           Biels    Quiosquet    389.0\n",
       "5      1          Barkus          Bar     24.0\n",
       "6      2       Blue Wall          bar     80.5\n",
       "7      3  Bounty Hunters  Social Club    125.3"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dfAll = dfAll.reset_index()\n",
    "dfAll"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dfAll.drop(columns=[\"index\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# https://pandas.pydata.org/docs/reference/api/pandas.concat.html\n",
    "dfAll = pd.concat([df1,df2],ignore_index=True)\n",
    "dfAll"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Unión\n",
    "\n",
    "Más información: https://pandas.pydata.org/docs/user_guide/merging.html"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>ID</th>\n",
       "      <th>Country</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Jhon</td>\n",
       "      <td>1</td>\n",
       "      <td>Italy</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Pep</td>\n",
       "      <td>2</td>\n",
       "      <td>Germany</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Snake</td>\n",
       "      <td>4</td>\n",
       "      <td>Italy</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      name  ID  Country\n",
       "0     Jhon   1    Italy\n",
       "1      Pep   2  Germany\n",
       "2  William   3  Finland\n",
       "3    Snake   4    Italy"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = pd.DataFrame([('Jhon', 1, \"Italy\"),\n",
    "                   ('Pep', 2, \"Germany\"),\n",
    "                   ('William', 3, \"Finland\"),\n",
    "                   ('Snake', 4, \"Italy\")],\n",
    "           columns=('name', 'ID', 'Country')\n",
    "                 )\n",
    "df1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>DNI</th>\n",
       "      <th>Weight</th>\n",
       "      <th>Salary</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>145.0</td>\n",
       "      <td>3000.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>189.2</td>\n",
       "      <td>2030.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>129.0</td>\n",
       "      <td>3000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>198.1</td>\n",
       "      <td>4020.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   DNI  Weight  Salary\n",
       "0    1   145.0  3000.1\n",
       "1    2   189.2  2030.2\n",
       "2    3   129.0  3000.0\n",
       "3    4   198.1  4020.2"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2 = pd.DataFrame([(1, 145.0, 3000.1),\n",
    "                   ( 2, 189.2, 2030.2),\n",
    "                   ( 3, 129.0, 3000.0),\n",
    "                   ( 4, 198.1, 4020.2)],\n",
    "           columns=('DNI', 'Weight', 'Salary')\n",
    "                 )\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "ename": "MergeError",
     "evalue": "No common columns to perform merge on. Merge options: left_on=None, right_on=None, left_index=False, right_index=False",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mMergeError\u001b[0m                                Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[56], line 3\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[38;5;66;03m# https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html\u001b[39;00m\n\u001b[0;32m----> 3\u001b[0m \u001b[43mdf1\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmerge\u001b[49m\u001b[43m(\u001b[49m\u001b[43mdf2\u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m~/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages/pandas/core/frame.py:10093\u001b[0m, in \u001b[0;36mDataFrame.merge\u001b[0;34m(self, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)\u001b[0m\n\u001b[1;32m  10074\u001b[0m \u001b[38;5;129m@Substitution\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m  10075\u001b[0m \u001b[38;5;129m@Appender\u001b[39m(_merge_doc, indents\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m2\u001b[39m)\n\u001b[1;32m  10076\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mmerge\u001b[39m(\n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m  10089\u001b[0m     validate: \u001b[38;5;28mstr\u001b[39m \u001b[38;5;241m|\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m,\n\u001b[1;32m  10090\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m DataFrame:\n\u001b[1;32m  10091\u001b[0m     \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01mpandas\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mcore\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mreshape\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mmerge\u001b[39;00m \u001b[38;5;28;01mimport\u001b[39;00m merge\n\u001b[0;32m> 10093\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mmerge\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m  10094\u001b[0m \u001b[43m        \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m  10095\u001b[0m \u001b[43m        \u001b[49m\u001b[43mright\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m  10096\u001b[0m \u001b[43m        \u001b[49m\u001b[43mhow\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mhow\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m  10097\u001b[0m \u001b[43m        \u001b[49m\u001b[43mon\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mon\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m  10098\u001b[0m \u001b[43m        \u001b[49m\u001b[43mleft_on\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mleft_on\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m  10099\u001b[0m \u001b[43m        \u001b[49m\u001b[43mright_on\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mright_on\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m  10100\u001b[0m \u001b[43m        \u001b[49m\u001b[43mleft_index\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mleft_index\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m  10101\u001b[0m \u001b[43m        \u001b[49m\u001b[43mright_index\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mright_index\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m  10102\u001b[0m \u001b[43m        \u001b[49m\u001b[43msort\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43msort\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m  10103\u001b[0m \u001b[43m        \u001b[49m\u001b[43msuffixes\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43msuffixes\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m  10104\u001b[0m \u001b[43m        \u001b[49m\u001b[43mcopy\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcopy\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m  10105\u001b[0m \u001b[43m        \u001b[49m\u001b[43mindicator\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mindicator\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m  10106\u001b[0m \u001b[43m        \u001b[49m\u001b[43mvalidate\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mvalidate\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m  10107\u001b[0m \u001b[43m    \u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m~/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages/pandas/core/reshape/merge.py:110\u001b[0m, in \u001b[0;36mmerge\u001b[0;34m(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)\u001b[0m\n\u001b[1;32m     93\u001b[0m \u001b[38;5;129m@Substitution\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;130;01m\\n\u001b[39;00m\u001b[38;5;124mleft : DataFrame or named Series\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m     94\u001b[0m \u001b[38;5;129m@Appender\u001b[39m(_merge_doc, indents\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m0\u001b[39m)\n\u001b[1;32m     95\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mmerge\u001b[39m(\n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m    108\u001b[0m     validate: \u001b[38;5;28mstr\u001b[39m \u001b[38;5;241m|\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m,\n\u001b[1;32m    109\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m DataFrame:\n\u001b[0;32m--> 110\u001b[0m     op \u001b[38;5;241m=\u001b[39m \u001b[43m_MergeOperation\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m    111\u001b[0m \u001b[43m        \u001b[49m\u001b[43mleft\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    112\u001b[0m \u001b[43m        \u001b[49m\u001b[43mright\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    113\u001b[0m \u001b[43m        \u001b[49m\u001b[43mhow\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mhow\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    114\u001b[0m \u001b[43m        \u001b[49m\u001b[43mon\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mon\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    115\u001b[0m \u001b[43m        \u001b[49m\u001b[43mleft_on\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mleft_on\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    116\u001b[0m \u001b[43m        \u001b[49m\u001b[43mright_on\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mright_on\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    117\u001b[0m \u001b[43m        \u001b[49m\u001b[43mleft_index\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mleft_index\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    118\u001b[0m \u001b[43m        \u001b[49m\u001b[43mright_index\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mright_index\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    119\u001b[0m \u001b[43m        \u001b[49m\u001b[43msort\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43msort\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    120\u001b[0m \u001b[43m        \u001b[49m\u001b[43msuffixes\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43msuffixes\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    121\u001b[0m \u001b[43m        \u001b[49m\u001b[43mindicator\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mindicator\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    122\u001b[0m \u001b[43m        \u001b[49m\u001b[43mvalidate\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mvalidate\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    123\u001b[0m \u001b[43m    \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    124\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m op\u001b[38;5;241m.\u001b[39mget_result(copy\u001b[38;5;241m=\u001b[39mcopy)\n",
      "File \u001b[0;32m~/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages/pandas/core/reshape/merge.py:685\u001b[0m, in \u001b[0;36m_MergeOperation.__init__\u001b[0;34m(self, left, right, how, on, left_on, right_on, axis, left_index, right_index, sort, suffixes, indicator, validate)\u001b[0m\n\u001b[1;32m    681\u001b[0m     \u001b[38;5;66;03m# stacklevel chosen to be correct when this is reached via pd.merge\u001b[39;00m\n\u001b[1;32m    682\u001b[0m     \u001b[38;5;66;03m# (and not DataFrame.join)\u001b[39;00m\n\u001b[1;32m    683\u001b[0m     warnings\u001b[38;5;241m.\u001b[39mwarn(msg, \u001b[38;5;167;01mFutureWarning\u001b[39;00m, stacklevel\u001b[38;5;241m=\u001b[39mfind_stack_level())\n\u001b[0;32m--> 685\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mleft_on, \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mright_on \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_validate_left_right_on\u001b[49m\u001b[43m(\u001b[49m\u001b[43mleft_on\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mright_on\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    687\u001b[0m cross_col \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[1;32m    688\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mhow \u001b[38;5;241m==\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mcross\u001b[39m\u001b[38;5;124m\"\u001b[39m:\n",
      "File \u001b[0;32m~/.pyenv/versions/3.11.0rc2/envs/my3110/lib/python3.11/site-packages/pandas/core/reshape/merge.py:1434\u001b[0m, in \u001b[0;36m_MergeOperation._validate_left_right_on\u001b[0;34m(self, left_on, right_on)\u001b[0m\n\u001b[1;32m   1432\u001b[0m common_cols \u001b[38;5;241m=\u001b[39m left_cols\u001b[38;5;241m.\u001b[39mintersection(right_cols)\n\u001b[1;32m   1433\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(common_cols) \u001b[38;5;241m==\u001b[39m \u001b[38;5;241m0\u001b[39m:\n\u001b[0;32m-> 1434\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m MergeError(\n\u001b[1;32m   1435\u001b[0m         \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mNo common columns to perform merge on. \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m   1436\u001b[0m         \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mMerge options: left_on=\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mleft_on\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m, \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m   1437\u001b[0m         \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mright_on=\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mright_on\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m, \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m   1438\u001b[0m         \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mleft_index=\u001b[39m\u001b[38;5;132;01m{\u001b[39;00m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mleft_index\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m, \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m   1439\u001b[0m         \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mright_index=\u001b[39m\u001b[38;5;132;01m{\u001b[39;00m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mright_index\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m   1440\u001b[0m     )\n\u001b[1;32m   1441\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m (\n\u001b[1;32m   1442\u001b[0m     \u001b[38;5;129;01mnot\u001b[39;00m left_cols\u001b[38;5;241m.\u001b[39mjoin(common_cols, how\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124minner\u001b[39m\u001b[38;5;124m\"\u001b[39m)\u001b[38;5;241m.\u001b[39mis_unique\n\u001b[1;32m   1443\u001b[0m     \u001b[38;5;129;01mor\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m right_cols\u001b[38;5;241m.\u001b[39mjoin(common_cols, how\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124minner\u001b[39m\u001b[38;5;124m\"\u001b[39m)\u001b[38;5;241m.\u001b[39mis_unique\n\u001b[1;32m   1444\u001b[0m ):\n\u001b[1;32m   1445\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m MergeError(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mData columns not unique: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00m\u001b[38;5;28mrepr\u001b[39m(common_cols)\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n",
      "\u001b[0;31mMergeError\u001b[0m: No common columns to perform merge on. Merge options: left_on=None, right_on=None, left_index=False, right_index=False"
     ]
    }
   ],
   "source": [
    "# https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html\n",
    "\n",
    "df1.merge(df2) ## ALERTA ! Necesita más parametros"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>ID</th>\n",
       "      <th>Country</th>\n",
       "      <th>DNI</th>\n",
       "      <th>Weight</th>\n",
       "      <th>Salary</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Jhon</td>\n",
       "      <td>1</td>\n",
       "      <td>Italy</td>\n",
       "      <td>1</td>\n",
       "      <td>145.0</td>\n",
       "      <td>3000.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Pep</td>\n",
       "      <td>2</td>\n",
       "      <td>Germany</td>\n",
       "      <td>2</td>\n",
       "      <td>189.2</td>\n",
       "      <td>2030.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "      <td>3</td>\n",
       "      <td>129.0</td>\n",
       "      <td>3000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Snake</td>\n",
       "      <td>4</td>\n",
       "      <td>Italy</td>\n",
       "      <td>4</td>\n",
       "      <td>198.1</td>\n",
       "      <td>4020.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      name  ID  Country  DNI  Weight  Salary\n",
       "0     Jhon   1    Italy    1   145.0  3000.1\n",
       "1      Pep   2  Germany    2   189.2  2030.2\n",
       "2  William   3  Finland    3   129.0  3000.0\n",
       "3    Snake   4    Italy    4   198.1  4020.2"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1.merge(df2, left_on='ID', right_on='DNI')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>DNI</th>\n",
       "      <th>Weight</th>\n",
       "      <th>Salary</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>145.0</td>\n",
       "      <td>3000.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>3</td>\n",
       "      <td>129.0</td>\n",
       "      <td>3000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>159.0</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3</td>\n",
       "      <td>109.0</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4</td>\n",
       "      <td>198.1</td>\n",
       "      <td>4020.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>5</td>\n",
       "      <td>200.0</td>\n",
       "      <td>5000.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   DNI  Weight  Salary\n",
       "0    1   145.0  3000.1\n",
       "1    3   129.0  3000.0\n",
       "2    3   159.0  4000.0\n",
       "3    3   109.0  5000.0\n",
       "4    4   198.1  4020.2\n",
       "5    5   200.0  5000.2"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2 = pd.DataFrame([(1, 145.0, 3000.1), #No 2\n",
    "                   ( 3, 129.0, 3000.0), # Multiples 3\n",
    "                   ( 3, 159.0, 4000.0),\n",
    "                   ( 3, 109.0, 5000.0),\n",
    "                   ( 4, 198.1, 4020.2),\n",
    "                   ( 5, 200.0, 5000.2)], #a new one \n",
    "           columns=('DNI', 'Weight', 'Salary')\n",
    "                 )\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>ID</th>\n",
       "      <th>Country</th>\n",
       "      <th>DNI</th>\n",
       "      <th>Weight</th>\n",
       "      <th>Salary</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Jhon</td>\n",
       "      <td>1</td>\n",
       "      <td>Italy</td>\n",
       "      <td>1</td>\n",
       "      <td>145.0</td>\n",
       "      <td>3000.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "      <td>3</td>\n",
       "      <td>129.0</td>\n",
       "      <td>3000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "      <td>3</td>\n",
       "      <td>159.0</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "      <td>3</td>\n",
       "      <td>109.0</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Snake</td>\n",
       "      <td>4</td>\n",
       "      <td>Italy</td>\n",
       "      <td>4</td>\n",
       "      <td>198.1</td>\n",
       "      <td>4020.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      name  ID  Country  DNI  Weight  Salary\n",
       "0     Jhon   1    Italy    1   145.0  3000.1\n",
       "1  William   3  Finland    3   129.0  3000.0\n",
       "2  William   3  Finland    3   159.0  4000.0\n",
       "3  William   3  Finland    3   109.0  5000.0\n",
       "4    Snake   4    Italy    4   198.1  4020.2"
      ]
     },
     "execution_count": 58,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1.merge(df2, left_on='ID', right_on='DNI')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>ID</th>\n",
       "      <th>Country</th>\n",
       "      <th>DNI</th>\n",
       "      <th>Weight</th>\n",
       "      <th>Salary</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Jhon</td>\n",
       "      <td>1</td>\n",
       "      <td>Italy</td>\n",
       "      <td>1.0</td>\n",
       "      <td>145.0</td>\n",
       "      <td>3000.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Pep</td>\n",
       "      <td>2</td>\n",
       "      <td>Germany</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "      <td>3.0</td>\n",
       "      <td>129.0</td>\n",
       "      <td>3000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "      <td>3.0</td>\n",
       "      <td>159.0</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "      <td>3.0</td>\n",
       "      <td>109.0</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Snake</td>\n",
       "      <td>4</td>\n",
       "      <td>Italy</td>\n",
       "      <td>4.0</td>\n",
       "      <td>198.1</td>\n",
       "      <td>4020.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      name  ID  Country  DNI  Weight  Salary\n",
       "0     Jhon   1    Italy  1.0   145.0  3000.1\n",
       "1      Pep   2  Germany  NaN     NaN     NaN\n",
       "2  William   3  Finland  3.0   129.0  3000.0\n",
       "3  William   3  Finland  3.0   159.0  4000.0\n",
       "4  William   3  Finland  3.0   109.0  5000.0\n",
       "5    Snake   4    Italy  4.0   198.1  4020.2"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1.merge(df2, left_on='ID', right_on='DNI',how=\"left\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>ID</th>\n",
       "      <th>Country</th>\n",
       "      <th>DNI</th>\n",
       "      <th>Weight</th>\n",
       "      <th>Salary</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Jhon</td>\n",
       "      <td>1.0</td>\n",
       "      <td>Italy</td>\n",
       "      <td>1</td>\n",
       "      <td>145.0</td>\n",
       "      <td>3000.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>William</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Finland</td>\n",
       "      <td>3</td>\n",
       "      <td>129.0</td>\n",
       "      <td>3000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>William</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Finland</td>\n",
       "      <td>3</td>\n",
       "      <td>159.0</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>William</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Finland</td>\n",
       "      <td>3</td>\n",
       "      <td>109.0</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Snake</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Italy</td>\n",
       "      <td>4</td>\n",
       "      <td>198.1</td>\n",
       "      <td>4020.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>5</td>\n",
       "      <td>200.0</td>\n",
       "      <td>5000.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      name   ID  Country  DNI  Weight  Salary\n",
       "0     Jhon  1.0    Italy    1   145.0  3000.1\n",
       "1  William  3.0  Finland    3   129.0  3000.0\n",
       "2  William  3.0  Finland    3   159.0  4000.0\n",
       "3  William  3.0  Finland    3   109.0  5000.0\n",
       "4    Snake  4.0    Italy    4   198.1  4020.2\n",
       "5      NaN  NaN      NaN    5   200.0  5000.2"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1.merge(df2, left_on='ID', right_on='DNI',how=\"right\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>ID</th>\n",
       "      <th>Country</th>\n",
       "      <th>DNI</th>\n",
       "      <th>Weight</th>\n",
       "      <th>Salary</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Jhon</td>\n",
       "      <td>1</td>\n",
       "      <td>Italy</td>\n",
       "      <td>1</td>\n",
       "      <td>145.0</td>\n",
       "      <td>3000.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "      <td>3</td>\n",
       "      <td>129.0</td>\n",
       "      <td>3000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "      <td>3</td>\n",
       "      <td>159.0</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "      <td>3</td>\n",
       "      <td>109.0</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Snake</td>\n",
       "      <td>4</td>\n",
       "      <td>Italy</td>\n",
       "      <td>4</td>\n",
       "      <td>198.1</td>\n",
       "      <td>4020.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      name  ID  Country  DNI  Weight  Salary\n",
       "0     Jhon   1    Italy    1   145.0  3000.1\n",
       "1  William   3  Finland    3   129.0  3000.0\n",
       "2  William   3  Finland    3   159.0  4000.0\n",
       "3  William   3  Finland    3   109.0  5000.0\n",
       "4    Snake   4    Italy    4   198.1  4020.2"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1.merge(df2, left_on='ID', right_on='DNI',how=\"inner\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>ID</th>\n",
       "      <th>Country</th>\n",
       "      <th>DNI</th>\n",
       "      <th>Weight</th>\n",
       "      <th>Salary</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Jhon</td>\n",
       "      <td>1</td>\n",
       "      <td>Italy</td>\n",
       "      <td>1</td>\n",
       "      <td>145.0</td>\n",
       "      <td>3000.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Jhon</td>\n",
       "      <td>1</td>\n",
       "      <td>Italy</td>\n",
       "      <td>3</td>\n",
       "      <td>129.0</td>\n",
       "      <td>3000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Jhon</td>\n",
       "      <td>1</td>\n",
       "      <td>Italy</td>\n",
       "      <td>3</td>\n",
       "      <td>159.0</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Jhon</td>\n",
       "      <td>1</td>\n",
       "      <td>Italy</td>\n",
       "      <td>3</td>\n",
       "      <td>109.0</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Jhon</td>\n",
       "      <td>1</td>\n",
       "      <td>Italy</td>\n",
       "      <td>4</td>\n",
       "      <td>198.1</td>\n",
       "      <td>4020.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Jhon</td>\n",
       "      <td>1</td>\n",
       "      <td>Italy</td>\n",
       "      <td>5</td>\n",
       "      <td>200.0</td>\n",
       "      <td>5000.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Pep</td>\n",
       "      <td>2</td>\n",
       "      <td>Germany</td>\n",
       "      <td>1</td>\n",
       "      <td>145.0</td>\n",
       "      <td>3000.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Pep</td>\n",
       "      <td>2</td>\n",
       "      <td>Germany</td>\n",
       "      <td>3</td>\n",
       "      <td>129.0</td>\n",
       "      <td>3000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Pep</td>\n",
       "      <td>2</td>\n",
       "      <td>Germany</td>\n",
       "      <td>3</td>\n",
       "      <td>159.0</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Pep</td>\n",
       "      <td>2</td>\n",
       "      <td>Germany</td>\n",
       "      <td>3</td>\n",
       "      <td>109.0</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Pep</td>\n",
       "      <td>2</td>\n",
       "      <td>Germany</td>\n",
       "      <td>4</td>\n",
       "      <td>198.1</td>\n",
       "      <td>4020.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Pep</td>\n",
       "      <td>2</td>\n",
       "      <td>Germany</td>\n",
       "      <td>5</td>\n",
       "      <td>200.0</td>\n",
       "      <td>5000.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "      <td>1</td>\n",
       "      <td>145.0</td>\n",
       "      <td>3000.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "      <td>3</td>\n",
       "      <td>129.0</td>\n",
       "      <td>3000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "      <td>3</td>\n",
       "      <td>159.0</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "      <td>3</td>\n",
       "      <td>109.0</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "      <td>4</td>\n",
       "      <td>198.1</td>\n",
       "      <td>4020.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>William</td>\n",
       "      <td>3</td>\n",
       "      <td>Finland</td>\n",
       "      <td>5</td>\n",
       "      <td>200.0</td>\n",
       "      <td>5000.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>Snake</td>\n",
       "      <td>4</td>\n",
       "      <td>Italy</td>\n",
       "      <td>1</td>\n",
       "      <td>145.0</td>\n",
       "      <td>3000.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>Snake</td>\n",
       "      <td>4</td>\n",
       "      <td>Italy</td>\n",
       "      <td>3</td>\n",
       "      <td>129.0</td>\n",
       "      <td>3000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>Snake</td>\n",
       "      <td>4</td>\n",
       "      <td>Italy</td>\n",
       "      <td>3</td>\n",
       "      <td>159.0</td>\n",
       "      <td>4000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>Snake</td>\n",
       "      <td>4</td>\n",
       "      <td>Italy</td>\n",
       "      <td>3</td>\n",
       "      <td>109.0</td>\n",
       "      <td>5000.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>Snake</td>\n",
       "      <td>4</td>\n",
       "      <td>Italy</td>\n",
       "      <td>4</td>\n",
       "      <td>198.1</td>\n",
       "      <td>4020.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>Snake</td>\n",
       "      <td>4</td>\n",
       "      <td>Italy</td>\n",
       "      <td>5</td>\n",
       "      <td>200.0</td>\n",
       "      <td>5000.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       name  ID  Country  DNI  Weight  Salary\n",
       "0      Jhon   1    Italy    1   145.0  3000.1\n",
       "1      Jhon   1    Italy    3   129.0  3000.0\n",
       "2      Jhon   1    Italy    3   159.0  4000.0\n",
       "3      Jhon   1    Italy    3   109.0  5000.0\n",
       "4      Jhon   1    Italy    4   198.1  4020.2\n",
       "5      Jhon   1    Italy    5   200.0  5000.2\n",
       "6       Pep   2  Germany    1   145.0  3000.1\n",
       "7       Pep   2  Germany    3   129.0  3000.0\n",
       "8       Pep   2  Germany    3   159.0  4000.0\n",
       "9       Pep   2  Germany    3   109.0  5000.0\n",
       "10      Pep   2  Germany    4   198.1  4020.2\n",
       "11      Pep   2  Germany    5   200.0  5000.2\n",
       "12  William   3  Finland    1   145.0  3000.1\n",
       "13  William   3  Finland    3   129.0  3000.0\n",
       "14  William   3  Finland    3   159.0  4000.0\n",
       "15  William   3  Finland    3   109.0  5000.0\n",
       "16  William   3  Finland    4   198.1  4020.2\n",
       "17  William   3  Finland    5   200.0  5000.2\n",
       "18    Snake   4    Italy    1   145.0  3000.1\n",
       "19    Snake   4    Italy    3   129.0  3000.0\n",
       "20    Snake   4    Italy    3   159.0  4000.0\n",
       "21    Snake   4    Italy    3   109.0  5000.0\n",
       "22    Snake   4    Italy    4   198.1  4020.2\n",
       "23    Snake   4    Italy    5   200.0  5000.2"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1.merge(df2,how=\"cross\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## Ejercicios\n",
    "\n",
    "\n",
    "Usando el fichero who.csv, se pide:<br/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>151.0</td>\n",
       "      <td>28.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>26088.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>...</td>\n",
       "      <td>692.50</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>257.00</td>\n",
       "      <td>231.9</td>\n",
       "      <td>257.00</td>\n",
       "      <td>5740436.0</td>\n",
       "      <td>5.44</td>\n",
       "      <td>22.9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Albania</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>27.0</td>\n",
       "      <td>98.7</td>\n",
       "      <td>6000.0</td>\n",
       "      <td>93.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>3172.0</td>\n",
       "      <td>0.6</td>\n",
       "      <td>...</td>\n",
       "      <td>3499.12</td>\n",
       "      <td>4.790000e+09</td>\n",
       "      <td>78.14</td>\n",
       "      <td>-2.040000e+09</td>\n",
       "      <td>18.47</td>\n",
       "      <td>15.5</td>\n",
       "      <td>18.47</td>\n",
       "      <td>1431793.9</td>\n",
       "      <td>2.21</td>\n",
       "      <td>45.4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Algeria</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>6.0</td>\n",
       "      <td>69.9</td>\n",
       "      <td>5940.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>96.0</td>\n",
       "      <td>33351.0</td>\n",
       "      <td>1.5</td>\n",
       "      <td>...</td>\n",
       "      <td>137535.56</td>\n",
       "      <td>6.970000e+10</td>\n",
       "      <td>351.36</td>\n",
       "      <td>4.700000e+09</td>\n",
       "      <td>40.00</td>\n",
       "      <td>31.2</td>\n",
       "      <td>40.00</td>\n",
       "      <td>20800000.0</td>\n",
       "      <td>2.61</td>\n",
       "      <td>63.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Andorra</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>83.0</td>\n",
       "      <td>83.0</td>\n",
       "      <td>74.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Angola</td>\n",
       "      <td>5</td>\n",
       "      <td>3</td>\n",
       "      <td>146.0</td>\n",
       "      <td>67.4</td>\n",
       "      <td>3890.0</td>\n",
       "      <td>49.0</td>\n",
       "      <td>51.0</td>\n",
       "      <td>16557.0</td>\n",
       "      <td>2.8</td>\n",
       "      <td>...</td>\n",
       "      <td>8991.46</td>\n",
       "      <td>1.490000e+10</td>\n",
       "      <td>27.13</td>\n",
       "      <td>9.140000e+09</td>\n",
       "      <td>164.10</td>\n",
       "      <td>242.5</td>\n",
       "      <td>164.10</td>\n",
       "      <td>8578749.0</td>\n",
       "      <td>4.14</td>\n",
       "      <td>53.3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 358 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       Country  CountryID  Continent  Adolescent fertility rate (%)  \\\n",
       "0  Afghanistan          1          1                          151.0   \n",
       "1      Albania          2          2                           27.0   \n",
       "2      Algeria          3          3                            6.0   \n",
       "3      Andorra          4          2                            NaN   \n",
       "4       Angola          5          3                          146.0   \n",
       "\n",
       "   Adult literacy rate (%)  \\\n",
       "0                     28.0   \n",
       "1                     98.7   \n",
       "2                     69.9   \n",
       "3                      NaN   \n",
       "4                     67.4   \n",
       "\n",
       "   Gross national income per capita (PPP international $)  \\\n",
       "0                                                NaN        \n",
       "1                                             6000.0        \n",
       "2                                             5940.0        \n",
       "3                                                NaN        \n",
       "4                                             3890.0        \n",
       "\n",
       "   Net primary school enrolment ratio female (%)  \\\n",
       "0                                            NaN   \n",
       "1                                           93.0   \n",
       "2                                           94.0   \n",
       "3                                           83.0   \n",
       "4                                           49.0   \n",
       "\n",
       "   Net primary school enrolment ratio male (%)  \\\n",
       "0                                          NaN   \n",
       "1                                         94.0   \n",
       "2                                         96.0   \n",
       "3                                         83.0   \n",
       "4                                         51.0   \n",
       "\n",
       "   Population (in thousands) total  Population annual growth rate (%)  ...  \\\n",
       "0                          26088.0                                4.0  ...   \n",
       "1                           3172.0                                0.6  ...   \n",
       "2                          33351.0                                1.5  ...   \n",
       "3                             74.0                                1.0  ...   \n",
       "4                          16557.0                                2.8  ...   \n",
       "\n",
       "   Total_CO2_emissions  Total_income  Total_reserves  \\\n",
       "0               692.50           NaN             NaN   \n",
       "1              3499.12  4.790000e+09           78.14   \n",
       "2            137535.56  6.970000e+10          351.36   \n",
       "3                  NaN           NaN             NaN   \n",
       "4              8991.46  1.490000e+10           27.13   \n",
       "\n",
       "   Trade_balance_goods_and_services  Under_five_mortality_from_CME  \\\n",
       "0                               NaN                         257.00   \n",
       "1                     -2.040000e+09                          18.47   \n",
       "2                      4.700000e+09                          40.00   \n",
       "3                               NaN                            NaN   \n",
       "4                      9.140000e+09                         164.10   \n",
       "\n",
       "   Under_five_mortality_from_IHME  Under_five_mortality_rate  \\\n",
       "0                           231.9                     257.00   \n",
       "1                            15.5                      18.47   \n",
       "2                            31.2                      40.00   \n",
       "3                             NaN                        NaN   \n",
       "4                           242.5                     164.10   \n",
       "\n",
       "   Urban_population  Urban_population_growth  Urban_population_pct_of_total  \n",
       "0         5740436.0                     5.44                           22.9  \n",
       "1         1431793.9                     2.21                           45.4  \n",
       "2        20800000.0                     2.61                           63.3  \n",
       "3               NaN                      NaN                            NaN  \n",
       "4         8578749.0                     4.14                           53.3  \n",
       "\n",
       "[5 rows x 358 columns]"
      ]
     },
     "execution_count": 62,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# read file csv\n",
    "import pandas as pd\n",
    "df_who = pd.read_csv(\"data/WHO.csv\")\n",
    "df_who.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "\n",
    "**1) ¿Cuál és la media de la población urbana (\"Urban_population\") de todos los países? ¿Su desviación típica (std)?**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "16657626.767446807"
      ]
     },
     "execution_count": 64,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who[\"Urban_population\"].mean()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "50948665.823935635"
      ]
     },
     "execution_count": 65,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who[\"Urban_population\"].std()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "**2) Consulta la fila del país: “Spain”**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_who[df_who[\"Country\"]==\"Spain\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "**3a) ¿Qué país tiene una mayor población urbana?**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>China</td>\n",
       "      <td>37</td>\n",
       "      <td>7</td>\n",
       "      <td>3.0</td>\n",
       "      <td>90.9</td>\n",
       "      <td>4660.0</td>\n",
       "      <td>96.0</td>\n",
       "      <td>100.0</td>\n",
       "      <td>1328474.0</td>\n",
       "      <td>0.6</td>\n",
       "      <td>...</td>\n",
       "      <td>5547757.5</td>\n",
       "      <td>1.890000e+12</td>\n",
       "      <td>295.23</td>\n",
       "      <td>1.250000e+11</td>\n",
       "      <td>27.3</td>\n",
       "      <td>27.8</td>\n",
       "      <td>27.3</td>\n",
       "      <td>527000000.0</td>\n",
       "      <td>2.95</td>\n",
       "      <td>40.4</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>1 rows × 358 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "   Country  CountryID  Continent  Adolescent fertility rate (%)  \\\n",
       "36   China         37          7                            3.0   \n",
       "\n",
       "    Adult literacy rate (%)  \\\n",
       "36                     90.9   \n",
       "\n",
       "    Gross national income per capita (PPP international $)  \\\n",
       "36                                             4660.0        \n",
       "\n",
       "    Net primary school enrolment ratio female (%)  \\\n",
       "36                                           96.0   \n",
       "\n",
       "    Net primary school enrolment ratio male (%)  \\\n",
       "36                                        100.0   \n",
       "\n",
       "    Population (in thousands) total  Population annual growth rate (%)  ...  \\\n",
       "36                        1328474.0                                0.6  ...   \n",
       "\n",
       "    Total_CO2_emissions  Total_income  Total_reserves  \\\n",
       "36            5547757.5  1.890000e+12          295.23   \n",
       "\n",
       "    Trade_balance_goods_and_services  Under_five_mortality_from_CME  \\\n",
       "36                      1.250000e+11                           27.3   \n",
       "\n",
       "    Under_five_mortality_from_IHME  Under_five_mortality_rate  \\\n",
       "36                            27.8                       27.3   \n",
       "\n",
       "    Urban_population  Urban_population_growth  Urban_population_pct_of_total  \n",
       "36       527000000.0                     2.95                           40.4  \n",
       "\n",
       "[1 rows x 358 columns]"
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who[df_who[\"Urban_population\"] == df_who[\"Urban_population\"].max()]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "**3b) ¿Qué paises tienen una población urbana menor a 50000 ?**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Antigua and Barbuda</td>\n",
       "      <td>6</td>\n",
       "      <td>4</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>15130.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>84.0</td>\n",
       "      <td>1.3</td>\n",
       "      <td>...</td>\n",
       "      <td>421.36</td>\n",
       "      <td>830000000.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-102000000.0</td>\n",
       "      <td>12.60</td>\n",
       "      <td>NaN</td>\n",
       "      <td>12.60</td>\n",
       "      <td>32468.25</td>\n",
       "      <td>2.25</td>\n",
       "      <td>39.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>69</th>\n",
       "      <td>Grenada</td>\n",
       "      <td>70</td>\n",
       "      <td>5</td>\n",
       "      <td>53.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>8770.0</td>\n",
       "      <td>83.0</td>\n",
       "      <td>84.0</td>\n",
       "      <td>106.0</td>\n",
       "      <td>0.3</td>\n",
       "      <td>...</td>\n",
       "      <td>234.50</td>\n",
       "      <td>414000000.0</td>\n",
       "      <td>23.19</td>\n",
       "      <td>-220000000.0</td>\n",
       "      <td>22.00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>22.00</td>\n",
       "      <td>32589.00</td>\n",
       "      <td>0.45</td>\n",
       "      <td>30.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>91</th>\n",
       "      <td>Kiribati</td>\n",
       "      <td>92</td>\n",
       "      <td>6</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6230.0</td>\n",
       "      <td>98.0</td>\n",
       "      <td>96.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>1.7</td>\n",
       "      <td>...</td>\n",
       "      <td>25.65</td>\n",
       "      <td>51900000.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-20800000.0</td>\n",
       "      <td>66.00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>66.00</td>\n",
       "      <td>46926.00</td>\n",
       "      <td>3.08</td>\n",
       "      <td>47.4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>151</th>\n",
       "      <td>Saint Kitts and Nevis</td>\n",
       "      <td>152</td>\n",
       "      <td>5</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>12440.0</td>\n",
       "      <td>78.0</td>\n",
       "      <td>64.0</td>\n",
       "      <td>50.0</td>\n",
       "      <td>1.3</td>\n",
       "      <td>...</td>\n",
       "      <td>135.57</td>\n",
       "      <td>387000000.0</td>\n",
       "      <td>24.28</td>\n",
       "      <td>-84300000.0</td>\n",
       "      <td>21.00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>21.00</td>\n",
       "      <td>15456.00</td>\n",
       "      <td>1.77</td>\n",
       "      <td>32.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>152</th>\n",
       "      <td>Saint Lucia</td>\n",
       "      <td>153</td>\n",
       "      <td>5</td>\n",
       "      <td>51.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>8500.0</td>\n",
       "      <td>97.0</td>\n",
       "      <td>99.0</td>\n",
       "      <td>163.0</td>\n",
       "      <td>1.1</td>\n",
       "      <td>...</td>\n",
       "      <td>370.06</td>\n",
       "      <td>764000000.0</td>\n",
       "      <td>27.39</td>\n",
       "      <td>-119000000.0</td>\n",
       "      <td>17.70</td>\n",
       "      <td>14.0</td>\n",
       "      <td>17.70</td>\n",
       "      <td>45482.32</td>\n",
       "      <td>1.15</td>\n",
       "      <td>27.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>154</th>\n",
       "      <td>Samoa</td>\n",
       "      <td>155</td>\n",
       "      <td>6</td>\n",
       "      <td>45.0</td>\n",
       "      <td>98.6</td>\n",
       "      <td>5090.0</td>\n",
       "      <td>91.0</td>\n",
       "      <td>90.0</td>\n",
       "      <td>185.0</td>\n",
       "      <td>0.8</td>\n",
       "      <td>...</td>\n",
       "      <td>150.22</td>\n",
       "      <td>286000000.0</td>\n",
       "      <td>12.46</td>\n",
       "      <td>-116000000.0</td>\n",
       "      <td>29.98</td>\n",
       "      <td>NaN</td>\n",
       "      <td>29.98</td>\n",
       "      <td>41181.28</td>\n",
       "      <td>1.18</td>\n",
       "      <td>22.4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>160</th>\n",
       "      <td>Seychelles</td>\n",
       "      <td>161</td>\n",
       "      <td>3</td>\n",
       "      <td>NaN</td>\n",
       "      <td>91.8</td>\n",
       "      <td>14360.0</td>\n",
       "      <td>100.0</td>\n",
       "      <td>99.0</td>\n",
       "      <td>86.0</td>\n",
       "      <td>0.7</td>\n",
       "      <td>...</td>\n",
       "      <td>578.91</td>\n",
       "      <td>563000000.0</td>\n",
       "      <td>8.34</td>\n",
       "      <td>-187000000.0</td>\n",
       "      <td>13.54</td>\n",
       "      <td>NaN</td>\n",
       "      <td>13.54</td>\n",
       "      <td>43854.10</td>\n",
       "      <td>1.20</td>\n",
       "      <td>52.9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>182</th>\n",
       "      <td>Tonga</td>\n",
       "      <td>183</td>\n",
       "      <td>6</td>\n",
       "      <td>17.0</td>\n",
       "      <td>98.9</td>\n",
       "      <td>5470.0</td>\n",
       "      <td>94.0</td>\n",
       "      <td>97.0</td>\n",
       "      <td>100.0</td>\n",
       "      <td>0.5</td>\n",
       "      <td>...</td>\n",
       "      <td>117.25</td>\n",
       "      <td>167000000.0</td>\n",
       "      <td>57.30</td>\n",
       "      <td>-94600000.0</td>\n",
       "      <td>24.48</td>\n",
       "      <td>NaN</td>\n",
       "      <td>24.48</td>\n",
       "      <td>23846.64</td>\n",
       "      <td>1.04</td>\n",
       "      <td>24.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>8 rows × 358 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                   Country  CountryID  Continent  \\\n",
       "5      Antigua and Barbuda          6          4   \n",
       "69                 Grenada         70          5   \n",
       "91                Kiribati         92          6   \n",
       "151  Saint Kitts and Nevis        152          5   \n",
       "152            Saint Lucia        153          5   \n",
       "154                  Samoa        155          6   \n",
       "160             Seychelles        161          3   \n",
       "182                  Tonga        183          6   \n",
       "\n",
       "     Adolescent fertility rate (%)  Adult literacy rate (%)  \\\n",
       "5                              NaN                      NaN   \n",
       "69                            53.0                      NaN   \n",
       "91                             NaN                      NaN   \n",
       "151                            NaN                      NaN   \n",
       "152                           51.0                      NaN   \n",
       "154                           45.0                     98.6   \n",
       "160                            NaN                     91.8   \n",
       "182                           17.0                     98.9   \n",
       "\n",
       "     Gross national income per capita (PPP international $)  \\\n",
       "5                                              15130.0        \n",
       "69                                              8770.0        \n",
       "91                                              6230.0        \n",
       "151                                            12440.0        \n",
       "152                                             8500.0        \n",
       "154                                             5090.0        \n",
       "160                                            14360.0        \n",
       "182                                             5470.0        \n",
       "\n",
       "     Net primary school enrolment ratio female (%)  \\\n",
       "5                                              NaN   \n",
       "69                                            83.0   \n",
       "91                                            98.0   \n",
       "151                                           78.0   \n",
       "152                                           97.0   \n",
       "154                                           91.0   \n",
       "160                                          100.0   \n",
       "182                                           94.0   \n",
       "\n",
       "     Net primary school enrolment ratio male (%)  \\\n",
       "5                                            NaN   \n",
       "69                                          84.0   \n",
       "91                                          96.0   \n",
       "151                                         64.0   \n",
       "152                                         99.0   \n",
       "154                                         90.0   \n",
       "160                                         99.0   \n",
       "182                                         97.0   \n",
       "\n",
       "     Population (in thousands) total  Population annual growth rate (%)  ...  \\\n",
       "5                               84.0                                1.3  ...   \n",
       "69                             106.0                                0.3  ...   \n",
       "91                              94.0                                1.7  ...   \n",
       "151                             50.0                                1.3  ...   \n",
       "152                            163.0                                1.1  ...   \n",
       "154                            185.0                                0.8  ...   \n",
       "160                             86.0                                0.7  ...   \n",
       "182                            100.0                                0.5  ...   \n",
       "\n",
       "     Total_CO2_emissions  Total_income  Total_reserves  \\\n",
       "5                 421.36   830000000.0             NaN   \n",
       "69                234.50   414000000.0           23.19   \n",
       "91                 25.65    51900000.0             NaN   \n",
       "151               135.57   387000000.0           24.28   \n",
       "152               370.06   764000000.0           27.39   \n",
       "154               150.22   286000000.0           12.46   \n",
       "160               578.91   563000000.0            8.34   \n",
       "182               117.25   167000000.0           57.30   \n",
       "\n",
       "     Trade_balance_goods_and_services  Under_five_mortality_from_CME  \\\n",
       "5                        -102000000.0                          12.60   \n",
       "69                       -220000000.0                          22.00   \n",
       "91                        -20800000.0                          66.00   \n",
       "151                       -84300000.0                          21.00   \n",
       "152                      -119000000.0                          17.70   \n",
       "154                      -116000000.0                          29.98   \n",
       "160                      -187000000.0                          13.54   \n",
       "182                       -94600000.0                          24.48   \n",
       "\n",
       "     Under_five_mortality_from_IHME  Under_five_mortality_rate  \\\n",
       "5                               NaN                      12.60   \n",
       "69                              NaN                      22.00   \n",
       "91                              NaN                      66.00   \n",
       "151                             NaN                      21.00   \n",
       "152                            14.0                      17.70   \n",
       "154                             NaN                      29.98   \n",
       "160                             NaN                      13.54   \n",
       "182                             NaN                      24.48   \n",
       "\n",
       "     Urban_population  Urban_population_growth  Urban_population_pct_of_total  \n",
       "5            32468.25                     2.25                           39.1  \n",
       "69           32589.00                     0.45                           30.6  \n",
       "91           46926.00                     3.08                           47.4  \n",
       "151          15456.00                     1.77                           32.2  \n",
       "152          45482.32                     1.15                           27.6  \n",
       "154          41181.28                     1.18                           22.4  \n",
       "160          43854.10                     1.20                           52.9  \n",
       "182          23846.64                     1.04                           24.0  \n",
       "\n",
       "[8 rows x 358 columns]"
      ]
     },
     "execution_count": 71,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_who[df_who[\"Urban_population\"] < 50000    ]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "**4) ¿El continente donde está situado Spain es el mismo que el de `United States of America?**\n",
    "\n",
    "Utiliza una condición para obtener un resultado Booleano (*True* o *False*)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>168</th>\n",
       "      <td>Spain</td>\n",
       "      <td>169</td>\n",
       "      <td>2</td>\n",
       "      <td>10.0</td>\n",
       "      <td>97.2</td>\n",
       "      <td>28200.0</td>\n",
       "      <td>99.0</td>\n",
       "      <td>100.0</td>\n",
       "      <td>43887.0</td>\n",
       "      <td>1.1</td>\n",
       "      <td>...</td>\n",
       "      <td>343701.53</td>\n",
       "      <td>6.780000e+11</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-5.770000e+10</td>\n",
       "      <td>4.9</td>\n",
       "      <td>4.2</td>\n",
       "      <td>4.9</td>\n",
       "      <td>33300000.0</td>\n",
       "      <td>1.75</td>\n",
       "      <td>76.7</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>1 rows × 358 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "    Country  CountryID  Continent  Adolescent fertility rate (%)  \\\n",
       "168   Spain        169          2                           10.0   \n",
       "\n",
       "     Adult literacy rate (%)  \\\n",
       "168                     97.2   \n",
       "\n",
       "     Gross national income per capita (PPP international $)  \\\n",
       "168                                            28200.0        \n",
       "\n",
       "     Net primary school enrolment ratio female (%)  \\\n",
       "168                                           99.0   \n",
       "\n",
       "     Net primary school enrolment ratio male (%)  \\\n",
       "168                                        100.0   \n",
       "\n",
       "     Population (in thousands) total  Population annual growth rate (%)  ...  \\\n",
       "168                          43887.0                                1.1  ...   \n",
       "\n",
       "     Total_CO2_emissions  Total_income  Total_reserves  \\\n",
       "168            343701.53  6.780000e+11             NaN   \n",
       "\n",
       "     Trade_balance_goods_and_services  Under_five_mortality_from_CME  \\\n",
       "168                     -5.770000e+10                            4.9   \n",
       "\n",
       "     Under_five_mortality_from_IHME  Under_five_mortality_rate  \\\n",
       "168                             4.2                        4.9   \n",
       "\n",
       "     Urban_population  Urban_population_growth  Urban_population_pct_of_total  \n",
       "168        33300000.0                     1.75                           76.7  \n",
       "\n",
       "[1 rows x 358 columns]"
      ]
     },
     "execution_count": 72,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sample_spain = df_who[df_who[\"Country\"]==\"Spain\"]\n",
    "sample_spain"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>CountryID</th>\n",
       "      <th>Continent</th>\n",
       "      <th>Adolescent fertility rate (%)</th>\n",
       "      <th>Adult literacy rate (%)</th>\n",
       "      <th>Gross national income per capita (PPP international $)</th>\n",
       "      <th>Net primary school enrolment ratio female (%)</th>\n",
       "      <th>Net primary school enrolment ratio male (%)</th>\n",
       "      <th>Population (in thousands) total</th>\n",
       "      <th>Population annual growth rate (%)</th>\n",
       "      <th>...</th>\n",
       "      <th>Total_CO2_emissions</th>\n",
       "      <th>Total_income</th>\n",
       "      <th>Total_reserves</th>\n",
       "      <th>Trade_balance_goods_and_services</th>\n",
       "      <th>Under_five_mortality_from_CME</th>\n",
       "      <th>Under_five_mortality_from_IHME</th>\n",
       "      <th>Under_five_mortality_rate</th>\n",
       "      <th>Urban_population</th>\n",
       "      <th>Urban_population_growth</th>\n",
       "      <th>Urban_population_pct_of_total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>192</th>\n",
       "      <td>United States of America</td>\n",
       "      <td>193</td>\n",
       "      <td>4</td>\n",
       "      <td>43.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>44070.0</td>\n",
       "      <td>93.0</td>\n",
       "      <td>91.0</td>\n",
       "      <td>302841.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>5776431.5</td>\n",
       "      <td>1.100000e+13</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-7.140000e+11</td>\n",
       "      <td>8.0</td>\n",
       "      <td>7.1</td>\n",
       "      <td>8.0</td>\n",
       "      <td>240000000.0</td>\n",
       "      <td>1.39</td>\n",
       "      <td>80.8</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>1 rows × 358 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                      Country  CountryID  Continent  \\\n",
       "192  United States of America        193          4   \n",
       "\n",
       "     Adolescent fertility rate (%)  Adult literacy rate (%)  \\\n",
       "192                           43.0                      NaN   \n",
       "\n",
       "     Gross national income per capita (PPP international $)  \\\n",
       "192                                            44070.0        \n",
       "\n",
       "     Net primary school enrolment ratio female (%)  \\\n",
       "192                                           93.0   \n",
       "\n",
       "     Net primary school enrolment ratio male (%)  \\\n",
       "192                                         91.0   \n",
       "\n",
       "     Population (in thousands) total  Population annual growth rate (%)  ...  \\\n",
       "192                         302841.0                                1.0  ...   \n",
       "\n",
       "     Total_CO2_emissions  Total_income  Total_reserves  \\\n",
       "192            5776431.5  1.100000e+13             NaN   \n",
       "\n",
       "     Trade_balance_goods_and_services  Under_five_mortality_from_CME  \\\n",
       "192                     -7.140000e+11                            8.0   \n",
       "\n",
       "     Under_five_mortality_from_IHME  Under_five_mortality_rate  \\\n",
       "192                             7.1                        8.0   \n",
       "\n",
       "     Urban_population  Urban_population_growth  Urban_population_pct_of_total  \n",
       "192       240000000.0                     1.39                           80.8  \n",
       "\n",
       "[1 rows x 358 columns]"
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sample_eeuu = df_who[df_who[\"Country\"]==\"United States of America\"]\n",
    "sample_eeuu"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 79,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sample_spain[\"Continent\"].values[0] == sample_eeuu[\"Continent\"].values[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "**5) ¿Cuáles son los cinco paises más contaminantes (\"Total_CO2_emissions\")?**\n",
    "\n",
    "Esta es mi pista para una solución elegante: http://pandas.pydata.org/pandas-docs/version/0.19.2/generated/pandas.DataFrame.sort_values.html"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "**6) Observando algunas muestras del fichero puedes establecer la relación entre el identificador del continente y su nombre?**\n",
    "\n",
    "Es decir, sabemos que Spain está en el continente Europeo y el código del continente es el 2. \n",
    "\n",
    "Existen los códigos de continentes: 1, 2, 3, 4, 5, 6, 7\n",
    "\n",
    "**Nota:** Hay dos códigos asociados a Asia.\n",
    "\n",
    "Haz las consultas pertinentes al dataframe para construir un diccionario con la siguiente estructura:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "codigoContinentes = {1:\"Asia\",2:\"Europa\"} #Al menos hay 7!\n",
    "print(codigoContinentes[2])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "codigoContinentes = {1:\"Asia\",2:\"Europa\",3:\"\", 4:\"\"} # TODO... "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "**7) Una vez identificado el nombre de los continentes, ¿puedes cambiar la columna de identificadores de continentes por sus respectivos nombres?**\n",
    "\n",
    "Esta es es mi pista para una solución elegante: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.map.html"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "**8) Puedes crear un nuevo dataframe con aquellos paises que sean de Europa?**\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "df2 = df #Con una simple asignación ya creas una dataframe \n",
    "type(df2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "1**9) ¿Cuáles son los paises más contaminantes de Europa?**\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**10) Calcula la cantidad de ayuda recibida por cada municipio en función del númeto total de habitantes.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dataframe de Municipios:\n",
      "        Nombre  Código Postal  Población\n",
      "0   Municipio1          60494      39232\n",
      "1   Municipio2          65125      15315\n",
      "2   Municipio3          15306      34075\n",
      "3   Municipio4          43936      10127\n",
      "4   Municipio5          77013      19470\n",
      "5   Municipio6          73691      10158\n",
      "6   Municipio7          63075       7214\n",
      "7   Municipio8          49755      41525\n",
      "8   Municipio9          72468      17417\n",
      "9  Municipio10          56930      35902\n",
      "\n",
      "Dataframe de Ayudas:\n",
      "        Nombre  Ayuda Económica (en euros)  Número de Beneficiarios\n",
      "0   Municipio1                        3407                       88\n",
      "1   Municipio2                        6081                       91\n",
      "2   Municipio3                        2618                       36\n",
      "3   Municipio4                        2208                       80\n",
      "4   Municipio5                        6409                       71\n",
      "5   Municipio6                        8735                       66\n",
      "6   Municipio7                        2649                       76\n",
      "7   Municipio8                        6796                       43\n",
      "8   Municipio9                        8113                       17\n",
      "9  Municipio10                        6180                       80\n"
     ]
    }
   ],
   "source": [
    "import pandas as pd\n",
    "import random\n",
    "\n",
    "random.seed(0)\n",
    "\n",
    "nombres = [f'Municipio{i}' for i in range(1, 11)] \n",
    "\n",
    "data_municipios = {\n",
    "    'Nombre': nombres,\n",
    "    'Código Postal': [random.randint(10000, 99999) for _ in range(10)],\n",
    "    'Población': [random.randint(1000, 50000) for _ in range(10)]  # Añadimos un atributo aleatorio, en este caso \"Población\"\n",
    "}\n",
    "\n",
    "df_municipios = pd.DataFrame(data_municipios)\n",
    "\n",
    "\n",
    "data_ayudas = {\n",
    "    'Nombre':  [f'Municipio{i}' for i in range(1, 11)],\n",
    "    'Ayuda Económica (en euros)': [random.randint(1000, 10000) for _ in range(10)],\n",
    "    'Número de Beneficiarios': [random.randint(10, 100) for _ in range(10)]\n",
    "}\n",
    "\n",
    "df_ayudas = pd.DataFrame(data_ayudas)\n",
    "\n",
    "print(\"Dataframe de Municipios:\")\n",
    "print(df_municipios)\n",
    "\n",
    "print(\"\\nDataframe de Ayudas:\")\n",
    "print(df_ayudas)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Usando el fichero climaMallorca.csv, se pide:<br/>\n",
    "**11) ¿Cual es la temperatura máxima cuando el viento es inferior a 10? ¿Cuántas muestras hay?**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**12) ¿Cual es la temperatura máxima cuando el viento es superior a 10 y inferior a 20? ¿Cuántas muestras hay?**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[![License: CC BY 4.0](https://img.shields.io/badge/License-CC_BY_4.0-lightgrey.svg)](https://creativecommons.org/licenses/by/4.0/) <br/>\n",
    "Isaac Lera and Gabriel Moya <br/>\n",
    "Universitat de les Illes Balears <br/>\n",
    "isaac.lera@uib.edu, gabriel.moya@uib.edu"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "my3110",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}