Commit

Site updated: 2023-11-21 15:17:28
Kinferiority committed Nov 21, 2023
1 parent e2c6ebf commit a2fc840
Showing 14 changed files with 831 additions and 5 deletions.
14 changes: 12 additions & 2 deletions 2023/11/07/使用SVM进行垃圾短信分类/index.html
@@ -164,7 +164,8 @@ <h2 class="title">Spam SMS classification with SVM</h2>
<div class="divider"></div>

<div class="content">
<h1 id="使用SVM进行垃圾短信分类"><a href="#使用SVM进行垃圾短信分类" class="headerlink" title="Spam SMS classification with SVM"></a>Spam SMS classification with SVM</h1><h2 id="1-数据集"><a href="#1-数据集" class="headerlink" title="1. Dataset"></a>1. Dataset</h2><p><a target="_blank" rel="noopener" href="https://www.kaggle.com/code/sadeghjalalian/96-accuracy-svm-spam-text-message-classification/comments">From Kaggle</a></p>
<p><a target="_blank" rel="noopener" href="https://kinferiority.github.io/2023/11/07/%E5%85%B3%E4%BA%8E%E4%BD%BF%E7%94%A8SVM%E8%BF%9B%E8%A1%8C%E5%9E%83%E5%9C%BE%E7%9F%AD%E4%BF%A1%E5%88%86%E7%B1%BB%E7%9A%84%E5%9B%9E%E9%A1%BE/">Related knowledge summary</a></p>
<h1 id="使用SVM进行垃圾短信分类"><a href="#使用SVM进行垃圾短信分类" class="headerlink" title="Spam SMS classification with SVM"></a>Spam SMS classification with SVM</h1><h2 id="1-数据集"><a href="#1-数据集" class="headerlink" title="1. Dataset"></a>1. Dataset</h2><p><a target="_blank" rel="noopener" href="https://www.kaggle.com/code/sadeghjalalian/96-accuracy-svm-spam-text-message-classification/comments">From Kaggle</a></p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">Category,Message</span><br><span class="line">ham,"Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat..."</span><br><span class="line">ham,Ok lar... Joking wif u oni...</span><br><span class="line">spam,Free entry in 2 a wkly comp to win FA Cup final tkts 21st May 2005. Text FA to 87121 to receive entry question(std txt rate)T&amp;C's apply 08452810075over18's</span><br><span class="line">ham,U dun say so early hor... U c already then say...</span><br></pre></td></tr></table></figure>
<h3 id="1-1阅读数据集相关信息"><a href="#1-1阅读数据集相关信息" class="headerlink" title="1.1 Inspecting the dataset"></a>1.1 Inspecting the dataset</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">import pandas as pd</span><br><span class="line">data = pd.read_csv("SPAM text message 20170820 - Data.csv")</span><br><span class="line">print(data.head())</span><br><span class="line">print(data.info())</span><br><span class="line">print(data.groupby("Category").describe())</span><br><span class="line">'''</span><br><span class="line">         Message</span><br><span class="line">           count unique                                                top freq</span><br><span class="line">Category</span><br><span class="line">ham         4825   4516                             Sorry, I'll call later   30</span><br><span class="line">spam         747    641  Please call our customer service representativ...    4</span><br><span class="line">'''</span><br></pre></td></tr></table></figure>
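The `groupby("Category").describe()` call above can be tried without downloading the Kaggle file. The following is a minimal sketch, assuming only the four sample rows quoted in section 1; the names `SAMPLE_CSV`, `summary`, `ham_count`, and `spam_count` are illustrative and do not come from the post.

```python
import io

import pandas as pd

# The four sample rows quoted in the post, inlined so the sketch runs
# without the Kaggle CSV file. This is NOT the full dataset.
SAMPLE_CSV = '''Category,Message
ham,"Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat..."
ham,Ok lar... Joking wif u oni...
spam,Free entry in 2 a wkly comp to win FA Cup final tkts 21st May 2005. Text FA to 87121 to receive entry question(std txt rate)T&C's apply 08452810075over18's
ham,U dun say so early hor... U c already then say...
'''

data = pd.read_csv(io.StringIO(SAMPLE_CSV))

# describe() on the grouped frame yields one row per class; the columns
# form a MultiIndex such as ("Message", "count").
summary = data.groupby("Category").describe()
ham_count = summary.loc["ham", ("Message", "count")]
spam_count = summary.loc["spam", ("Message", "count")]
print(summary)
```

On the full dataset the same lookup would return the 4825 and 747 counts shown in the output above.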
<h3 id="1-2创建饼图"><a href="#1-2创建饼图" class="headerlink" title="1.2 Creating a pie chart"></a>1.2 Creating a pie chart</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line">import matplotlib.pyplot as plt</span><br><span class="line"># Add a new column, Length, holding the length of each message</span><br><span class="line">data['Length'] = data['Message'].apply(len)</span><br><span class="line">print(data.head())</span><br><span class="line"></span><br><span class="line"># Show the class distribution as percentages</span><br><span class="line">explode = (0.1, 0)</span><br><span class="line"># explode specifies whether a wedge is pulled out from the pie.</span><br><span class="line"># Here the first element, 0.1, pulls the "ham" wedge out; the second element, 0, leaves the "spam" wedge in place.</span><br><span class="line">fig1, ax1 = plt.subplots(figsize=(12, 7))</span><br><span class="line"># plt.subplots returns a figure and an Axes object (ax); ax refers to the subplot and is used to draw on it.</span><br><span class="line">ax1.pie(data['Category'].value_counts(), explode=explode, labels=['ham', 'spam'], autopct='%1.1f%%',</span><br><span class="line">        shadow=True)</span><br><span class="line">"""</span><br><span class="line">data['Category'].value_counts(): returns a pandas Series, a one-dimensional structure whose index holds the category names and whose values are the number of occurrences of each category.</span><br><span class="line">This computes the count of each class in the "Category" column, i.e. the numbers of "ham" and "spam" messages.</span><br><span class="line">explode=explode: uses the explode variable defined above to pull out the "ham" wedge.</span><br><span class="line">labels=['ham', 'spam']: sets the wedge labels to "ham" and "spam".</span><br><span class="line">autopct='%1.1f%%': sets the format of the percentage shown on each wedge, with one decimal place.</span><br><span class="line">shadow=True: adds a shadow so the pie looks three-dimensional.</span><br><span class="line"></span><br><span class="line">"""</span><br><span class="line">ax1.axis('equal')</span><br><span class="line">plt.tight_layout()</span><br><span class="line">plt.legend()</span><br><span class="line">plt.show()</span><br></pre></td></tr></table></figure>
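The hard-coded `labels=['ham', 'spam']` above only lines up with the wedges because `value_counts()` sorts by count in descending order, which puts the larger "ham" class first. The sketch below checks that ordering and the percentages that `autopct='%1.1f%%'` would print, using the class counts reported in section 1.1. It swaps in `collections.Counter` as a standard-library stand-in for `value_counts()`; the variable names are illustrative.

```python
from collections import Counter

# Class counts reported by data.groupby("Category").describe() in the post.
# The list is deliberately built with "spam" first to show that the
# ordering comes from the counts, not from the input order.
categories = ["spam"] * 747 + ["ham"] * 4825

# most_common() sorts by count, descending, which mirrors the default
# ordering of pandas Series.value_counts().
counts = Counter(categories).most_common()
labels = [name for name, _ in counts]
sizes = [n for _, n in counts]

# Apply the same format string that is passed to autopct in the post.
total = sum(sizes)
pct_labels = ["%1.1f%%" % (100 * n / total) for n in sizes]
print(labels, pct_labels)
```

With pandas itself, passing `labels=counts.index` (where `counts = data['Category'].value_counts()`) would avoid relying on the order of a hand-written label list.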
@@ -205,7 +206,16 @@ <h1>About this Post</h1>


<div class="container post-prev-next">
<a class="next"></a>

<a href="/2023/11/21/Word2Vec/" class="next">
<div>
<div class="text">
<p class="label">Next</p>
<h3 class="title">Word2Vec</h3>
</div>
</div>
</a>


<a href="/2023/11/07/%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E6%A1%86%E6%9E%B6/" class="prev">
<div>
Binary file added 2023/11/21/Word2Vec/image-20231121150924367.png
