-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathpython-multiprocesss.html
executable file
·321 lines (270 loc) · 19.2 KB
/
python-multiprocesss.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
<!DOCTYPE html>
<html lang="zh-cn">
<head>
<meta charset="utf-8">
<title>kizzy的个人博客 - python的多线程,多进程以及协程。</title>
<meta name="description" content="">
<meta name="author" content="kizzy">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<!-- Le HTML5 shim, for IE6-8 support of HTML elements -->
<!--[if lt IE 9]>
<script src="https://kuikui.tech/theme/html5.js"></script>
<![endif]-->
<!-- Le styles -->
<link href="https://kuikui.tech/theme/bootstrap.min.css" rel="stylesheet">
<link href="https://kuikui.tech/theme/bootstrap.min.responsive.css" rel="stylesheet">
<link href="https://kuikui.tech/theme/local.css" rel="stylesheet">
<link href="https://kuikui.tech/theme/pygments.css" rel="stylesheet">
<!-- So Firefox can bookmark->"abo this site" -->
<link href="https://kuikui.tech/feeds/all.atom.xml" rel="alternate" title="kizzy的个人博客" type="application/atom+xml">
</head>
<body>
<div class="navbar">
<div class="navbar-inner">
<div class="container">
<a class="btn btn-navbar" data-toggle="collapse" data-target=".nav-collapse">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</a>
<a class="brand" href="https://kuikui.tech">kizzy的个人博客</a>
<div class="nav-collapse">
<ul class="nav">
</ul>
</div>
</div>
</div>
</div>
<div class="container">
<div class="content">
<div class="row">
<div class="span9">
<div class='article'>
<div class="content-title">
<h1>python的多线程,多进程以及协程。</h1>
Sun 26 July 2020
by <a class="url fn" href="https://kuikui.tech/author/kizzy.html">kizzy</a>
</div>
<div><p> <strong>什么是进程:</strong></p>
<blockquote>
<p>进程(Process)是计算机中的程序关于某数据集合上的一次运行活动,是系统进行资源分配和调度的基本单位,是<a href="https://baike.baidu.com/item/操作系统">操作系统</a>结构的基础。在早期面向进程设计的计算机结构中,进程是程序的基本执行实体;在当代面向线程设计的计算机结构中,进程是线程的容器。程序是指令、数据及其组织形式的描述,进程是程序的实体。</p>
</blockquote>
<p> <strong>什么是多进程:</strong></p>
<blockquote>
<p>python中的多线程其实并不是真正的多线程,如果想要充分地使用多核CPU的资源,在python中大部分情况需要使用多进程。Python提供了非常好用的多进程包multiprocessing,只需要定义一个函数,Python会完成其他所有事情。借助这个包,可以轻松完成从单进程到<strong>并发执行</strong>的转换。multiprocessing支持子进程、通信和共享数据、执行不同形式的同步,提供了Process、Queue、Pipe、Lock等组件。</p>
</blockquote>
<p> <strong>什么是线程:</strong></p>
<blockquote>
<p><strong>线程</strong>(英语:thread)是<a href="https://baike.baidu.com/item/操作系统">操作系统</a>能够进行运算<a href="https://baike.baidu.com/item/调度">调度</a>的最小单位。它被包含在<a href="https://baike.baidu.com/item/进程">进程</a>之中,是<a href="https://baike.baidu.com/item/进程">进程</a>中的实际运作单位。一条线程指的是<a href="https://baike.baidu.com/item/进程">进程</a>中一个单一顺序的控制流,一个进程中可以并发多个线程,每条线程并行执行不同的任务。在Unix System V及<a href="https://baike.baidu.com/item/SunOS">SunOS</a>中也被称为轻量进程(lightweight processes),但轻量进程更多指内核线程(kernel thread),而把用户线程(user thread)称为线程。</p>
</blockquote>
<p> <strong>什么是多线程:</strong></p>
<blockquote>
<p> 多线程类似于同时执行多个不同程序,多线程运行有如下优点:</p>
<ul>
<li>
<p>使用线程可以把占据长时间的程序中的任务放到后台去处理。</p>
</li>
<li>
<p>用户界面可以更加吸引人,这样比如用户点击了一个按钮去触发某些事件的处理,可以弹出一个进度条来显示处理的进度</p>
</li>
<li>
<p>程序的运行速度可能加快</p>
</li>
<li>
<p>在一些等待的任务实现上如用户输入、文件读写和网络收发数据等,线程就比较有用了。在这种情况下我们可以释放一些珍贵的资源如内存占用等等。</p>
</li>
</ul>
<p><strong>线程在执行过程中与进程还是有区别的。每个独立的进程有一个程序运行的入口、顺序执行序列和程序的出口。但是线程不能够独立执行,必须依存在应用程序中,由应用程序提供多个线程执行控制。</strong></p>
<p>每个线程都有他自己的一组CPU寄存器,称为线程的上下文,该上下文反映了线程上次运行该线程的CPU寄存器的状态。</p>
<p>指令指针和堆栈指针寄存器是线程上下文中两个最重要的寄存器,线程总是在进程得到上下文中运行的,这些地址都用于标志拥有线程的进程地址空间中的内存。</p>
<ul>
<li>
<p>线程可以被<strong>抢占(中断)</strong>。</p>
</li>
<li>
<p>在其他线程正在运行时,线程可以暂时搁置(也称为睡眠) -- 这就是线程的退让</p>
</li>
</ul>
</blockquote>
<p> <strong>什么是协程:</strong></p>
<blockquote>
<p>协程与<a href="https://baike.baidu.com/item/子例程/8231923">子例程</a>一样,协程(coroutine)也是一种程序组件。相对子例程而言,协程更为一般和灵活,但在实践中使用没有子例程那样广泛。协程源自 Simula 和 Modula-2 语言,但也有其他语言支持。协程不是进程或线程,其执行过程更类似于子例程,或者说不带返回值的<a href="https://baike.baidu.com/item/函数调用/4127405">函数调用</a>。</p>
</blockquote>
<p> 这些手段可以方便我们提升程序的性能。</p>
<p><strong>对比实验:</strong></p>
<p>资料显示,如果多线程的进程是<strong>CPU密集型</strong>的,那多线程并不能有多少效率上的提升,相反还可能会因为线程的频繁切换,导致效率下降,推荐使用多进程;如果是<strong>IO密集型</strong>,多线程进程可以利用IO阻塞等待时的空闲时间执行其他线程,提升效率。所以我们根据实验对比不同场景的效率</p>
<p><strong>导包:</strong></p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">requests</span>
<span class="kn">import</span> <span class="nn">time</span>
<span class="kn">from</span> <span class="nn">threading</span> <span class="kn">import</span> <span class="n">Thread</span>
<span class="kn">from</span> <span class="nn">multiprocessing</span> <span class="kn">import</span> <span class="n">Process</span>
</code></pre></div>
<p><strong>定义CPU密集的程序:</strong></p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">count</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
<span class="c1"># 使程序完成50万计算</span>
<span class="n">c</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">while</span> <span class="n">c</span> <span class="o"><</span> <span class="mi">500000</span><span class="p">:</span>
<span class="n">c</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="n">x</span> <span class="o">+=</span> <span class="n">x</span>
<span class="n">y</span> <span class="o">+=</span> <span class="n">y</span>
</code></pre></div>
<p><strong>定义IO密集的程序:</strong></p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">write</span><span class="p">():</span>
<span class="n">f</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s2">"test.txt"</span><span class="p">,</span> <span class="s2">"w"</span><span class="p">)</span>
<span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">5000000</span><span class="p">):</span>
<span class="n">f</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="s2">"testwrite</span><span class="se">\n</span><span class="s2">"</span><span class="p">)</span>
<span class="n">f</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">read</span><span class="p">():</span>
<span class="n">f</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s2">"test.txt"</span><span class="p">,</span> <span class="s2">"r"</span><span class="p">)</span>
<span class="n">lines</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">readlines</span><span class="p">()</span>
<span class="n">f</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</code></pre></div>
<p><strong>定义网络请求程序:</strong></p>
<div class="highlight"><pre><span></span><code><span class="n">_head</span> <span class="o">=</span> <span class="p">{</span>
<span class="s1">'User-Agent'</span><span class="p">:</span> <span class="s1">'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36'</span><span class="p">}</span>
<span class="n">url</span> <span class="o">=</span> <span class="s2">"http://www.tieba.com"</span>
<span class="k">def</span> <span class="nf">http_request</span><span class="p">():</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">webPage</span> <span class="o">=</span> <span class="n">requests</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">,</span> <span class="n">headers</span><span class="o">=</span><span class="n">_head</span><span class="p">)</span>
<span class="n">html</span> <span class="o">=</span> <span class="n">webPage</span><span class="o">.</span><span class="n">text</span>
<span class="k">return</span> <span class="p">{</span><span class="s2">"context"</span><span class="p">:</span> <span class="n">html</span><span class="p">}</span>
<span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="k">return</span> <span class="p">{</span><span class="s2">"error"</span><span class="p">:</span> <span class="n">e</span><span class="p">}</span>
</code></pre></div>
<p><strong>单线程执行上述程序的性能测试:</strong></p>
<div class="highlight"><pre><span></span><code><span class="c1"># CPU密集操作</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">time</span><span class="p">()</span>
<span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">):</span>
<span class="n">count</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"Line cpu"</span><span class="p">,</span> <span class="n">time</span><span class="o">.</span><span class="n">time</span><span class="p">()</span> <span class="o">-</span> <span class="n">t</span><span class="p">)</span>
<span class="c1"># IO密集操作</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">time</span><span class="p">()</span>
<span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">):</span>
<span class="n">write</span><span class="p">()</span>
<span class="n">read</span><span class="p">()</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"Line IO"</span><span class="p">,</span> <span class="n">time</span><span class="o">.</span><span class="n">time</span><span class="p">()</span> <span class="o">-</span> <span class="n">t</span><span class="p">)</span>
<span class="c1"># 网络请求密集型操作</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">time</span><span class="p">()</span>
<span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">):</span>
<span class="n">http_request</span><span class="p">()</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"Line Http Request"</span><span class="p">,</span> <span class="n">time</span><span class="o">.</span><span class="n">time</span><span class="p">()</span> <span class="o">-</span> <span class="n">t</span><span class="p">)</span>
</code></pre></div>
<p><strong>输出:</strong></p>
<ul>
<li>CPU密集:95.6059999466、91.57099986076355 92.52800011634827、 99.96799993515015</li>
<li>IO密集:24.25、21.76699995994568、21.769999980926514、22.060999870300293</li>
<li>网络请求密集型: 4.519999980926514、8.563999891281128、4.371000051498413、4.522000074386597、14.671000003814697</li>
</ul>
<p><strong>对比分析:</strong></p>
<table>
<thead>
<tr>
<th align="left"></th>
<th align="left">CPU密集型操作</th>
<th align="left">IO密集型操作</th>
<th>网络请求密集型操作</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">线性操作</td>
<td align="left">94.91824996469</td>
<td align="left">22.46199995279</td>
<td>7.3296000004</td>
</tr>
<tr>
<td align="left">多线程操作</td>
<td align="left">101.1700000762</td>
<td align="left">24.8605000973</td>
<td>0.5053332647</td>
</tr>
<tr>
<td align="left">多进程操作</td>
<td align="left">53.8899999857</td>
<td align="left">12.7840000391</td>
<td>0.5045000315</td>
</tr>
</tbody>
</table>
<p><strong>结论</strong>:</p>
<p>多进程对于CPU密集操作有比较有效的优化,而多线程似乎对于IO密集操作并没有多大提升。其实原因在于cpython的GIL锁使得一个CPU同一时间只有一个线程在运行,否则理论上多线程会使得IO密集操作提升较为大的性能。</p>
<hr>
<blockquote>
<p>绝对不能用患病的、意识不清的爱来爱你自己,一个人首先要学会爱自己,以健康和正常的爱来爱自己。这样一个人才能与自己安然相处,而不是在外飘荡。</p>
</blockquote></div>
<hr>
<a href="https://twitter.com/share" class="twitter-share-button" data-text="python的多线程,多进程以及协程。" data-via="YXTkbHci1iEWgAM">Tweet</a>
<script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script>
<h2>Comments</h2>
<div id="disqus_thread"></div>
<script type="text/javascript">
var disqus_shortname = 'kuikui';
var disqus_title = 'python的多线程,多进程以及协程。';
(function() {
var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
dsq.src = 'https://' + disqus_shortname + '.disqus.com/embed.js';
(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
})();
</script>
<noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
</div>
</div>
<div class="span3">
<div class="well" style="padding: 8px 0; background-color: #FBFBFB;">
<ul class="nav nav-list">
<li class="nav-header">
Site
</li>
<li><a href="https://kuikui.tech/archives.html">Archives</a>
<li><a href="https://kuikui.tech/tags.html">Tags</a>
<li><a href="https://kuikui.tech/feeds/all.atom.xml" rel="alternate">Atom feed</a></li>
</ul>
</div>
<div class="well" style="padding: 8px 0; background-color: #FBFBFB;">
<ul class="nav nav-list">
<li class="nav-header">
Categories
</li>
<li><a href="https://kuikui.tech/category/gan-xiang-yu-ji-lu.html">感想与记录</a></li>
<li><a href="https://kuikui.tech/category/ji-qi-xue-xi.html">机器学习</a></li>
<li><a href="https://kuikui.tech/category/lei-bie.html">类别</a></li>
<li><a href="https://kuikui.tech/category/python.html">python</a></li>
<li><a href="https://kuikui.tech/category/shua-ti-bi-ji.html">刷题笔记</a></li>
<li><a href="https://kuikui.tech/category/sui-xiang.html">随想</a></li>
<li><a href="https://kuikui.tech/category/yue-du-bi-ji.html">阅读笔记</a></li>
<li><a href="https://kuikui.tech/category/yun-wei.html">运维</a></li>
</ul>
</div>
<div class="well" style="padding: 8px 0; background-color: #FBFBFB;">
<ul class="nav nav-list">
<li class="nav-header">
Links
</li>
<li><a href="https://github.com/kk456852/">Github</a></li>
<li><a href="https://twitter.com/YXTkbHci1iEWgAM/">twitter</a></li>
<li><a href="https://kuikui.tech/">blog</a></li>
</ul>
</div>
<div class="social">
<div class="well" style="padding: 8px 0; background-color: #FBFBFB;">
<ul class="nav nav-list">
<li class="nav-header">
Social
</li>
<li><a href="https://mahaoqu.github.io/">Teacher Ma</a></li>
<li><a href="#">虚位以待</a></li>
</ul>
</div>
</div>
</div>
</div> </div>
<footer>
<br />
<p><a href="https://kuikui.tech">kizzy的个人博客</a> © kizzy 2020</p>
</footer>
</div> <!-- /container -->
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>
<script src="https://kuikui.tech/theme/bootstrap-collapse.js"></script>
<script>var _gaq=[['_setAccount','UA-158839315-1'],['_trackPageview']];(function(d,t){var g=d.createElement(t),s=d.getElementsByTagName(t)[0];g.src='//www.google-analytics.com/ga.js';s.parentNode.insertBefore(g,s)}(document,'script'))</script>
</body>
</html>