"# Using and Debugging Dask (and other Big Data systems)\n",
"\n",
"In this lecture, we're going to deep dive into using Dask and apply what we've learned in previous lectures to understand why the code works the way that it does.\n",
"evalue": "Metadata inference failed in `apply`.\n\nYou have supplied a custom function and Dask is unable to \ndetermine the type of output that that function returns. \n\nTo resolve this please provide a meta= keyword.\nThe docstring of the Dask function you ran should have more information.\n\nOriginal error is below:\n------------------------\nValueError(\"time data 'foo' does not match format '%Y-%m-%d %H:%M:%S'\")\n\nTraceback:\n---------\n File \"/Users/sanjaykrishnan/Documents/cmsc21800/venv/lib/python3.7/site-packages/dask/dataframe/utils.py\", line 175, in raise_on_meta_error\n yield\n File \"/Users/sanjaykrishnan/Documents/cmsc21800/venv/lib/python3.7/site-packages/dask/dataframe/core.py\", line 5513, in _emulate\n return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))\n File \"/Users/sanjaykrishnan/Documents/cmsc21800/venv/lib/python3.7/site-packages/dask/utils.py\", line 900, in __call__\n return getattr(obj, self.method)(*args, **kwargs)\n File \"/Users/sanjaykrishnan/Documents/cmsc21800/venv/lib/python3.7/site-packages/pandas/core/series.py\", line 4045, in apply\n mapped = lib.map_infer(values, f, convert=convert_dtype)\n File \"pandas/_libs/lib.pyx\", line 2228, in pandas._libs.lib.map_infer\n File \"<ipython-input-9-3a0da807679d>\", line 2, in <lambda>\n deltas = df['dropoff_datetime'].apply(lambda x: datetime.strptime(x, '%Y-%m-%d %H:%M:%S').timestamp()) - df['pickup_datetime'].apply(datetime.strptime(x, '%Y-%m-%d %H:%M:%S').timestamp())\n File \"/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/_strptime.py\", line 577, in _strptime_datetime\n tt, fraction, gmtoff_fraction = _strptime(data_string, format)\n File \"/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/_strptime.py\", line 359, in _strptime\n (data_string, format))\n",
"\u001b[0;32m/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/_strptime.py\u001b[0m in \u001b[0;36m_strptime_datetime\u001b[0;34m(cls, data_string, format)\u001b[0m\n\u001b[1;32m 576\u001b[0m format string.\"\"\"\n\u001b[0;32m--> 577\u001b[0;31m \u001b[0mtt\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfraction\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mgmtoff_fraction\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0m_strptime\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdata_string\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mformat\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 578\u001b[0m \u001b[0mtzname\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mgmtoff\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtt\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/_strptime.py\u001b[0m in \u001b[0;36m_strptime\u001b[0;34m(data_string, format)\u001b[0m\n\u001b[1;32m 358\u001b[0m raise ValueError(\"time data %r does not match format %r\" %\n\u001b[0;32m--> 359\u001b[0;31m (data_string, format))\n\u001b[0m\u001b[1;32m 360\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdata_string\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m!=\u001b[0m \u001b[0mfound\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mValueError\u001b[0m: time data 'foo' does not match format '%Y-%m-%d %H:%M:%S'",
"\nThe above exception was the direct cause of the following exception:\n",
"\u001b[0;31mValueError\u001b[0m: Metadata inference failed in `apply`.\n\nYou have supplied a custom function and Dask is unable to \ndetermine the type of output that that function returns. \n\nTo resolve this please provide a meta= keyword.\nThe docstring of the Dask function you ran should have more information.\n\nOriginal error is below:\n------------------------\nValueError(\"time data 'foo' does not match format '%Y-%m-%d %H:%M:%S'\")\n\nTraceback:\n---------\n File \"/Users/sanjaykrishnan/Documents/cmsc21800/venv/lib/python3.7/site-packages/dask/dataframe/utils.py\", line 175, in raise_on_meta_error\n yield\n File \"/Users/sanjaykrishnan/Documents/cmsc21800/venv/lib/python3.7/site-packages/dask/dataframe/core.py\", line 5513, in _emulate\n return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))\n File \"/Users/sanjaykrishnan/Documents/cmsc21800/venv/lib/python3.7/site-packages/dask/utils.py\", line 900, in __call__\n return getattr(obj, self.method)(*args, **kwargs)\n File \"/Users/sanjaykrishnan/Documents/cmsc21800/venv/lib/python3.7/site-packages/pandas/core/series.py\", line 4045, in apply\n mapped = lib.map_infer(values, f, convert=convert_dtype)\n File \"pandas/_libs/lib.pyx\", line 2228, in pandas._libs.lib.map_infer\n File \"<ipython-input-9-3a0da807679d>\", line 2, in <lambda>\n deltas = df['dropoff_datetime'].apply(lambda x: datetime.strptime(x, '%Y-%m-%d %H:%M:%S').timestamp()) - df['pickup_datetime'].apply(datetime.strptime(x, '%Y-%m-%d %H:%M:%S').timestamp())\n File \"/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/_strptime.py\", line 577, in _strptime_datetime\n tt, fraction, gmtoff_fraction = _strptime(data_string, format)\n File \"/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/_strptime.py\", line 359, in _strptime\n (data_string, format))\n"
"You did not provide metadata, so Dask is running your function on a small dataset to guess output types. It is possible that Dask will guess incorrectly.\n",
"To provide an explicit output types or to silence this message, please provide the `meta=` keyword, as described in the map or apply function that you are using.\n",
"You did not provide metadata, so Dask is running your function on a small dataset to guess output types. It is possible that Dask will guess incorrectly.\n",
"To provide an explicit output types or to silence this message, please provide the `meta=` keyword, as described in the map or apply function that you are using.\n",