<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title>Pandas - Tag - 300.Watts</title><link>https://300watts.me/tags/pandas/</link><description>Pandas - Tag - 300.Watts</description><generator>Hugo -- gohugo.io</generator><language>en</language><managingEditor>morristai01@gmail.com (Morris)</managingEditor><webMaster>morristai01@gmail.com (Morris)</webMaster><copyright>This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.</copyright><lastBuildDate>Sat, 26 Aug 2017 11:16:25 +0800</lastBuildDate><atom:link href="https://300watts.me/tags/pandas/" rel="self" type="application/rss+xml"/><item><title>Pandas ufuncs Tips and Tricks</title><link>https://300watts.me/posts/pandas-ufuncs-tips-and-tricks/</link><pubDate>Sat, 26 Aug 2017 11:16:25 +0800</pubDate><author><name>Morris</name></author><guid>https://300watts.me/posts/pandas-ufuncs-tips-and-tricks/</guid><description><![CDATA[<blockquote>
  <p>Why are pandas <code>ufuncs</code> recommended over <code>apply</code>?</p>

</blockquote><p>Pandas has an <code>apply</code> function that lets you run arbitrary functions across every value in a column. The catch is that <code>apply</code> is only marginally faster than a plain Python loop. That&rsquo;s why pandas&rsquo; built-in <code>ufuncs</code> are the preferred choice for column preprocessing.<br>
<code>ufuncs</code> are special functions built on top of NumPy and implemented in <strong>C</strong>, which is why they&rsquo;re so fast. Below we introduce several examples of <code>ufuncs</code>: <code>.diff</code>, <code>.shift</code>, <code>.cumsum</code>, <code>.cumcount</code>, the <code>.str</code> accessor methods (for strings), and the <code>.dt</code> accessor methods (for dates).</p>]]></description></item><item><title>Pandas cut and qcut Functions</title><link>https://300watts.me/posts/pandas-cut-and-qcut-functions/</link><pubDate>Sat, 05 Aug 2017 11:46:56 +0800</pubDate><author><name>Morris</name></author><guid>https://300watts.me/posts/pandas-cut-and-qcut-functions/</guid><description><![CDATA[<p>When we have continuous numerical values, we can discretize them using <code>cut</code> and <code>qcut</code>.
The <code>cut</code> function bins values by numeric intervals, while <code>qcut</code> bins them by quantiles.
In other words, given a number of bins, <code>cut</code> produces bins of equal width, while <code>qcut</code> produces bins that each hold roughly the same number of values.</p>
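<p>To make the difference concrete, here is a minimal sketch that bins the same values both ways. The sample data and variable names are made up for illustration and do not come from the original post; the skewed value <code>100</code> is there only to exaggerate the contrast.</p>

```python
import pandas as pd

# Hypothetical sample data: nine small values and one large outlier.
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])

# cut with an integer splits the value range into bins of equal width.
by_width = pd.cut(values, bins=2)

# qcut places the bin edges at quantiles, so each bin holds a similar count.
by_quantile = pd.qcut(values, q=2)

# Count how many values landed in each bin, in bin order.
width_counts = by_width.value_counts().sort_index().tolist()
quantile_counts = by_quantile.value_counts().sort_index().tolist()

print(width_counts)     # [9, 1]
print(quantile_counts)  # [5, 5]
```

<p>With equal-width bins the outlier sits alone in the upper bin, while the quantile bins split the ten values five and five.</p>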
<blockquote>
  <h2 id="the-cut-function" class="headerLink">
    <a href="#the-cut-function" class="header-mark"></a>The cut function</h2>
</blockquote><p>Suppose we have the ages of a group of people:<br>
<strong>ages</strong> = <code>[20, 22, 25, 27, 21, 23, 37, 31, 61, 45, 41, 32, 101]</code><br>
If we want to discretize this list into &ldquo;18 to 25&rdquo;, &ldquo;25 to 35&rdquo;, &ldquo;35 to 60&rdquo;, and &ldquo;60 and above&rdquo;, we can use the <code>cut</code> function:</p>]]></description></item></channel></rss>