add structural pattern matching
khuyentran1401 committed Aug 20, 2024
1 parent 8293d71 commit 21aa9ee
Showing 7 changed files with 555 additions and 31 deletions.
114 changes: 101 additions & 13 deletions Chapter1/python_new_features.ipynb
@@ -122,28 +122,108 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "b7169caa",
"id": "ce7a6d1c",
"metadata": {},
"source": [
"Extracting data from nested structures often leads to complex, error-prone code with multiple checks and conditionals. Consider this traditional approach:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "69ff114d",
"metadata": {},
"outputs": [],
"source": [
"Have you ever wanted to match complex data types and extract their information? \n",
"def get_youngest_pet(pet_info):\n",
" if isinstance(pet_info, list) and len(pet_info) == 2:\n",
" if all(\"age\" in pet for pet in pet_info):\n",
" print(\"Age is extracted from a list\")\n",
" return min(pet_info[0][\"age\"], pet_info[1][\"age\"])\n",
" elif isinstance(pet_info, dict) and \"age\" in pet_info:\n",
" if isinstance(pet_info[\"age\"], dict):\n",
" print(\"Age is extracted from a dict\")\n",
" ages = pet_info[\"age\"].values()\n",
" return min(ages)\n",
"\n",
"Python 3.10 allows you to do exactly that with the `match` statement and the `case` statements. "
" # Handle other cases or raise an exception\n",
" raise ValueError(\"Invalid input format\")"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "976c0668",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Age is extracted from a list\n"
]
},
{
"data": {
"text/plain": [
"1"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pet_info1 = [\n",
" {\"name\": \"bim\", \"age\": 1},\n",
" {\"name\": \"pepper\", \"age\": 9},\n",
"]\n",
"get_youngest_pet(pet_info1)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "99225ef8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Age is extracted from a dict\n"
]
},
{
"data": {
"text/plain": [
"1"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pet_info2 = {'age': {\"bim\": 1, \"pepper\": 9}}\n",
"get_youngest_pet(pet_info2)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "42104aba",
"id": "b7169caa",
"metadata": {},
"source": [
"The code below uses structural pattern matching to extract ages from the matching data structure. "
"Python 3.10's pattern matching provides a more declarative and readable way to handle complex data structures, reducing the need for nested conditionals and type checks."
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 22,
"id": "a181f881",
"metadata": {},
"outputs": [],
@@ -157,12 +237,15 @@
" case {'age': {}}:\n",
" print(\"Age is extracted from a dict\")\n",
" ages = pet_info['age'].values()\n",
" return min(ages)\n"
" return min(ages)\n",
"\n",
" case _:\n",
" raise ValueError(\"Invalid input format\")\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 23,
"id": "9604eb87",
"metadata": {},
"outputs": [
@@ -179,18 +262,22 @@
"1"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "display_data"
"output_type": "execute_result"
}
],
"source": [
"pet_info1 = [{\"name\": \"bim\", \"age\": 1}, {\"name\": \"pepper\", \"age\": 9}]\n",
"pet_info1 = [\n",
" {\"name\": \"bim\", \"age\": 1},\n",
" {\"name\": \"pepper\", \"age\": 9},\n",
"]\n",
"get_youngest_pet(pet_info1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 8,
"id": "7f8f2b9f",
"metadata": {},
"outputs": [
@@ -207,8 +294,9 @@
"1"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "display_data"
"output_type": "execute_result"
}
],
"source": [
Expand Down
113 changes: 113 additions & 0 deletions Chapter5/SQL.ipynb
@@ -1023,6 +1023,119 @@
"source": [
"[Link to sql-metadata](https://github.com/macbre/sql-metadata)."
]
},
{
"cell_type": "markdown",
"id": "055e3474",
"metadata": {},
"source": [
"### SQLGlot: Write Once, Run Anywhere SQL"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8d257860",
"metadata": {
"tags": [
"hide-cell"
]
},
"outputs": [],
"source": [
"!pip install \"sqlglot[rs]\""
]
},
{
"cell_type": "markdown",
"id": "49d7f225",
"metadata": {},
"source": [
"SQL dialects vary across databases, making it challenging to port queries between different database systems.\n",
"\n",
"SQLGlot addresses this by providing a parser and transpiler supporting 21 dialects. This enables automatic SQL translation between systems, eliminating the need for manual query rewrites."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "72644346",
"metadata": {},
"outputs": [],
"source": [
"import sqlglot"
]
},
{
"cell_type": "markdown",
"id": "733ad01f",
"metadata": {},
"source": [
"Convert a DuckDB-specific date formatting query into an equivalent query in Hive SQL:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "8f46784d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"SELECT DATE_FORMAT(x, 'yy-M-ss')\""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sqlglot.transpile(\"SELECT STRFTIME(x, '%y-%-m-%S')\", read=\"duckdb\", write=\"hive\")[0]"
]
},
{
"cell_type": "markdown",
"id": "123e7f07",
"metadata": {},
"source": [
"Convert a SQL query to Spark SQL:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "397512c2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SELECT\n",
" `id`,\n",
" `name`,\n",
" CAST(`price` AS FLOAT) AS `converted_price`\n",
"FROM `products`\n"
]
}
],
"source": [
"# Spark SQL requires backticks (`) for delimited identifiers and uses `FLOAT` over `REAL`\n",
"sql = \"SELECT id, name, CAST(price AS REAL) AS converted_price FROM products\"\n",
"\n",
"# Translates the query into Spark SQL, formats it, and delimits all of its identifiers\n",
"print(sqlglot.transpile(sql, write=\"spark\", identify=True, pretty=True)[0])"
]
},
{
"cell_type": "markdown",
"id": "2a983e21",
"metadata": {},
"source": [
"[Link to SQLGlot](https://bit.ly/4dGyTmP)."
]
}
],
"metadata": {
Expand Down
68 changes: 64 additions & 4 deletions docs/Chapter1/python_new_features.html
@@ -234,6 +234,7 @@
<li class="toctree-l2"><a class="reference internal" href="../Chapter2/dataclasses.html">3.7. Data Classes</a></li>
<li class="toctree-l2"><a class="reference internal" href="../Chapter2/typing.html">3.8. Typing</a></li>
<li class="toctree-l2"><a class="reference internal" href="../Chapter2/pathlib.html">3.9. pathlib</a></li>
<li class="toctree-l2"><a class="reference internal" href="../Chapter2/pydantic.html">3.10. Pydantic</a></li>
</ul>
</li>
<li class="toctree-l1 has-children"><a class="reference internal" href="../Chapter3/Chapter3.html">4. Pandas</a><input class="toctree-checkbox" id="toctree-checkbox-4" name="toctree-checkbox-4" type="checkbox"/><label class="toctree-toggle" for="toctree-checkbox-4"><i class="fa-solid fa-chevron-down"></i></label><ul>
@@ -587,9 +588,62 @@ <h2><span class="section-number">2.10.1. </span>Simplify Conditional Execution w
</section>
<section id="structural-pattern-matching-in-python-3-10">
<h2><span class="section-number">2.10.2. </span>Structural Pattern Matching in Python 3.10<a class="headerlink" href="#structural-pattern-matching-in-python-3-10" title="Permalink to this heading">#</a></h2>
<p>Have you ever wanted to match complex data types and extract their information?</p>
<p>Python 3.10 allows you to do exactly that with the <code class="docutils literal notranslate"><span class="pre">match</span></code> statement and the <code class="docutils literal notranslate"><span class="pre">case</span></code> statements.</p>
<p>The code below uses structural pattern matching to extract ages from the matching data structure.</p>
<p>Extracting data from nested structures often leads to complex, error-prone code with multiple checks and conditionals. Consider this traditional approach:</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">get_youngest_pet</span><span class="p">(</span><span class="n">pet_info</span><span class="p">):</span>
<span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">pet_info</span><span class="p">,</span> <span class="nb">list</span><span class="p">)</span> <span class="ow">and</span> <span class="nb">len</span><span class="p">(</span><span class="n">pet_info</span><span class="p">)</span> <span class="o">==</span> <span class="mi">2</span><span class="p">:</span>
<span class="k">if</span> <span class="nb">all</span><span class="p">(</span><span class="s2">&quot;age&quot;</span> <span class="ow">in</span> <span class="n">pet</span> <span class="k">for</span> <span class="n">pet</span> <span class="ow">in</span> <span class="n">pet_info</span><span class="p">):</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Age is extracted from a list&quot;</span><span class="p">)</span>
<span class="k">return</span> <span class="nb">min</span><span class="p">(</span><span class="n">pet_info</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="s2">&quot;age&quot;</span><span class="p">],</span> <span class="n">pet_info</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="s2">&quot;age&quot;</span><span class="p">])</span>
<span class="k">elif</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">pet_info</span><span class="p">,</span> <span class="nb">dict</span><span class="p">)</span> <span class="ow">and</span> <span class="s2">&quot;age&quot;</span> <span class="ow">in</span> <span class="n">pet_info</span><span class="p">:</span>
<span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">pet_info</span><span class="p">[</span><span class="s2">&quot;age&quot;</span><span class="p">],</span> <span class="nb">dict</span><span class="p">):</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Age is extracted from a dict&quot;</span><span class="p">)</span>
<span class="n">ages</span> <span class="o">=</span> <span class="n">pet_info</span><span class="p">[</span><span class="s2">&quot;age&quot;</span><span class="p">]</span><span class="o">.</span><span class="n">values</span><span class="p">()</span>
<span class="k">return</span> <span class="nb">min</span><span class="p">(</span><span class="n">ages</span><span class="p">)</span>

<span class="c1"># Handle other cases or raise an exception</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">&quot;Invalid input format&quot;</span><span class="p">)</span>
</pre></div>
</div>
</div>
</div>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pet_info1</span> <span class="o">=</span> <span class="p">[</span>
<span class="p">{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;bim&quot;</span><span class="p">,</span> <span class="s2">&quot;age&quot;</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span>
<span class="p">{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;pepper&quot;</span><span class="p">,</span> <span class="s2">&quot;age&quot;</span><span class="p">:</span> <span class="mi">9</span><span class="p">},</span>
<span class="p">]</span>
<span class="n">get_youngest_pet</span><span class="p">(</span><span class="n">pet_info1</span><span class="p">)</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>Age is extracted from a list
</pre></div>
</div>
<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>1
</pre></div>
</div>
</div>
</div>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pet_info2</span> <span class="o">=</span> <span class="p">{</span><span class="s1">&#39;age&#39;</span><span class="p">:</span> <span class="p">{</span><span class="s2">&quot;bim&quot;</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="s2">&quot;pepper&quot;</span><span class="p">:</span> <span class="mi">9</span><span class="p">}}</span>
<span class="n">get_youngest_pet</span><span class="p">(</span><span class="n">pet_info2</span><span class="p">)</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>Age is extracted from a dict
</pre></div>
</div>
<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>1
</pre></div>
</div>
</div>
</div>
<p>Python 3.10’s pattern matching provides a more declarative and readable way to handle complex data structures, reducing the need for nested conditionals and type checks.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">get_youngest_pet</span><span class="p">(</span><span class="n">pet_info</span><span class="p">):</span>
@@ -602,13 +656,19 @@ <h2><span class="section-number">2.10.2. </span>Structural Pattern Matching in P
<span class="nb">print</span><span class="p">(</span><span class="s2">&quot;Age is extracted from a dict&quot;</span><span class="p">)</span>
<span class="n">ages</span> <span class="o">=</span> <span class="n">pet_info</span><span class="p">[</span><span class="s1">&#39;age&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">values</span><span class="p">()</span>
<span class="k">return</span> <span class="nb">min</span><span class="p">(</span><span class="n">ages</span><span class="p">)</span>

<span class="k">case</span><span class="w"> </span><span class="k">_</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">&quot;Invalid input format&quot;</span><span class="p">)</span>
</pre></div>
</div>
</div>
</div>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pet_info1</span> <span class="o">=</span> <span class="p">[{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;bim&quot;</span><span class="p">,</span> <span class="s2">&quot;age&quot;</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span> <span class="p">{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;pepper&quot;</span><span class="p">,</span> <span class="s2">&quot;age&quot;</span><span class="p">:</span> <span class="mi">9</span><span class="p">}]</span>
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pet_info1</span> <span class="o">=</span> <span class="p">[</span>
<span class="p">{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;bim&quot;</span><span class="p">,</span> <span class="s2">&quot;age&quot;</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span>
<span class="p">{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;pepper&quot;</span><span class="p">,</span> <span class="s2">&quot;age&quot;</span><span class="p">:</span> <span class="mi">9</span><span class="p">},</span>
<span class="p">]</span>
<span class="n">get_youngest_pet</span><span class="p">(</span><span class="n">pet_info1</span><span class="p">)</span>
</pre></div>
</div>
Expand Down