<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
      <title>f(x) on f(x) </title>
      <generator uri="https://gohugo.io">Hugo</generator>
    <link>http://firoyang.org/</link>
    <language>en-us</language>
    <author>Firo Yang</author>
    
    <updated>Sat, 08 Jun 2019 00:00:00 UTC</updated>
    
    <item>
      <title>Linux kernel page allocation</title>
      <link>http://firoyang.org/cs/page_allocator/</link>
      <pubDate>Sat, 08 Jun 2019 00:00:00 UTC</pubDate>
      <author>Firo Yang</author>
      <guid>http://firoyang.org/cs/page_allocator/</guid>
      <description>

&lt;h1 id=&#34;gfp&#34;&gt;GFP&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://www.kernel.org/doc/html/latest/core-api/memory-allocation.html&#34;&gt;Memory Allocation Guide&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://www.kernel.org/doc/html/latest/core-api/mm-api.html#memory-allocation-controls&#34;&gt;Memory Allocation Controls&lt;/a&gt;&lt;br /&gt;
Also see include/linux/gfp.h&lt;/p&gt;

&lt;h2 id=&#34;removed-gfp-flags&#34;&gt;Removed GFP flags&lt;/h2&gt;

&lt;p&gt;__GFP_WAIT: renamed to __GFP_RECLAIM by the commit &amp;ldquo;mm, page_alloc: Rename __GFP_WAIT to __GFP_RECLAIM&amp;rdquo;.&lt;/p&gt;

&lt;h2 id=&#34;gfp-zone-table-and-gfp-zone-bad&#34;&gt;GFP_ZONE_TABLE and GFP_ZONE_BAD&lt;/h2&gt;

&lt;p&gt;commit b70d94ee438b3fd9c15c7691d7a932a135c18101&lt;br /&gt;
Refs: v2.6.30-5489-gb70d94ee438b&lt;br /&gt;
Author:     Christoph Lameter &lt;a href=&#34;mailto:cl@linux.com&#34;&gt;cl@linux.com&lt;/a&gt;&lt;br /&gt;
AuthorDate: Tue Jun 16 15:32:46 2009 -0700&lt;br /&gt;
    page-allocator: use integer fields lookup for gfp_zone and check for errors in flags passed to the page allocator&lt;br /&gt;
+ * GFP_ZONE_TABLE is a word size bitstring that is used for looking up the&lt;br /&gt;
+ * zone to use given the lowest 4 bits of gfp_t. Entries are ZONE_SHIFT long&lt;br /&gt;
+ * and there are 16 of them to cover all possible combinations of&lt;br /&gt;
+ * __GFP_DMA, __GFP_DMA32, __GFP_MOVABLE and __GFP_HIGHMEM&lt;br /&gt;
+ * The zone fallback order is MOVABLE=&amp;gt;HIGHMEM=&amp;gt;NORMAL=&amp;gt;DMA32=&amp;gt;DMA.&lt;br /&gt;
+ * But GFP_MOVABLE is not only a zone specifier but also an allocation&lt;br /&gt;
+ * policy. Therefore __GFP_MOVABLE plus another zone selector is valid.&lt;br /&gt;
+ * Only 1bit of the lowest 3 bit (DMA,DMA32,HIGHMEM) can be set to &amp;ldquo;1&amp;rdquo;.&lt;br /&gt;
+ *       bit       result&lt;br /&gt;
+ *       0x0    =&amp;gt; NORMAL&lt;br /&gt;
+ *       0x1    =&amp;gt; DMA or NORMAL&lt;br /&gt;
+ *       0x2    =&amp;gt; HIGHMEM or NORMAL&lt;br /&gt;
+ *       0x3    =&amp;gt; BAD (DMA+HIGHMEM)&lt;br /&gt;
+ *       0x4    =&amp;gt; DMA32 or DMA or NORMAL&lt;br /&gt;
+ *       0x5    =&amp;gt; BAD (DMA+DMA32)&lt;br /&gt;
+ *       0x6    =&amp;gt; BAD (HIGHMEM+DMA32)&lt;br /&gt;
+ *       0x7    =&amp;gt; BAD (HIGHMEM+DMA32+DMA)&lt;br /&gt;
+ *       0x8    =&amp;gt; NORMAL (MOVABLE+0)&lt;br /&gt;
+ *       0x9    =&amp;gt; DMA or NORMAL (MOVABLE+DMA)&lt;br /&gt;
+ *       0xa    =&amp;gt; MOVABLE (Movable is valid only if HIGHMEM is set too)&lt;br /&gt;
+ *       0xb    =&amp;gt; BAD (MOVABLE+HIGHMEM+DMA)&lt;br /&gt;
+ *       0xc    =&amp;gt; DMA32 (MOVABLE+HIGHMEM+DMA32)&lt;br /&gt;
+ *       0xd    =&amp;gt; BAD (MOVABLE+DMA32+DMA)&lt;br /&gt;
+ *       0xe    =&amp;gt; BAD (MOVABLE+DMA32+HIGHMEM)&lt;br /&gt;
+ *       0xf    =&amp;gt; BAD (MOVABLE+DMA32+HIGHMEM+DMA)&lt;/p&gt;
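
&lt;p&gt;A minimal sketch of the lookup described in the comment above, written in Python rather than kernel C. The bit values follow the quoted table; the zone numbering, ZONE_SHIFT = 3, and the helper names build_table() and gfp_zone() are assumptions for illustration, not the definitions in include/linux/gfp.h. For the &amp;ldquo;X or NORMAL&amp;rdquo; rows the sketch simply picks the first zone named.&lt;/p&gt;

```python
# Illustrative model of a GFP_ZONE_TABLE-style lookup; not kernel code.
# Bit values follow the table quoted above; everything else is assumed.
GFP_DMA, GFP_HIGHMEM, GFP_DMA32, GFP_MOVABLE = 0x1, 0x2, 0x4, 0x8

ZONES = ["DMA", "DMA32", "NORMAL", "HIGHMEM", "MOVABLE"]  # assumed numbering
ZONE_SHIFT = 3  # assumed: 3 bits are enough to number five zones

# Valid combinations of the low four bits and the zone each one selects.
VALID = {
    0x0: "NORMAL",
    GFP_DMA: "DMA",
    GFP_HIGHMEM: "HIGHMEM",
    GFP_DMA32: "DMA32",
    GFP_MOVABLE: "NORMAL",
    GFP_MOVABLE | GFP_DMA: "DMA",
    GFP_MOVABLE | GFP_HIGHMEM: "MOVABLE",
    GFP_MOVABLE | GFP_DMA32: "DMA32",
}


def build_table():
    """Pack one ZONE_SHIFT-bit zone index per flag combination into an int."""
    table = 0
    for bits, zone in VALID.items():
        table |= ZONES.index(zone) * 2 ** (bits * ZONE_SHIFT)
    return table


def gfp_zone(flags, table):
    """Return the zone name selected by the low four bits of flags."""
    bits = flags % 16  # keep only the four zone-modifier bits
    if bits not in VALID:
        raise ValueError("bad zone flag combination: " + hex(bits))
    index = (table >> (bits * ZONE_SHIFT)) % 2 ** ZONE_SHIFT
    return ZONES[index]


TABLE = build_table()
print(gfp_zone(GFP_HIGHMEM, TABLE))                # HIGHMEM
print(gfp_zone(GFP_MOVABLE | GFP_HIGHMEM, TABLE))  # MOVABLE
```

&lt;p&gt;The point of the packed table is that one shift and one mask replace a chain of branches; the combinations marked BAD are rejected here with an exception.&lt;/p&gt;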

&lt;h1 id=&#34;alloc-flags&#34;&gt;Alloc flags&lt;/h1&gt;

&lt;p&gt;gfp_to_alloc_flags() derives the internal ALLOC_* flags from the GFP mask.&lt;br /&gt;
ALLOC_HIGH: in __zone_watermark_ok(): if (alloc_flags &amp;amp; ALLOC_HIGH) min -= min / 2;&lt;br /&gt;
ALLOC_HARDER: in rmqueue(): if (alloc_flags &amp;amp; ALLOC_HARDER) page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC);&lt;/p&gt;
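
&lt;p&gt;The ALLOC_HIGH arithmetic above can be illustrated with a small Python sketch. The function names effective_min() and watermark_ok(), and modelling the flags as a set of strings, are assumptions for illustration; the check is simplified and ignores lowmem reserves and the allocation order that the real __zone_watermark_ok() also considers.&lt;/p&gt;

```python
# Illustrative model of the ALLOC_HIGH adjustment; not kernel code.
def effective_min(min_wmark, alloc_flags):
    """Return the watermark after the ALLOC_HIGH discount quoted above."""
    m = min_wmark
    if "ALLOC_HIGH" in alloc_flags:
        m -= m // 2  # mirrors: min -= min / 2 (integer division)
    return m


def watermark_ok(free_pages, min_wmark, alloc_flags=frozenset()):
    """Simplified check: ignores lowmem reserves and allocation order."""
    return free_pages > effective_min(min_wmark, alloc_flags)


print(watermark_ok(80, 100))                  # False
print(watermark_ok(80, 100, {"ALLOC_HIGH"}))  # True
```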

&lt;h1 id=&#34;pf-memalloc&#34;&gt;PF_MEMALLOC&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://www.kernel.org/doc/gorman/html/understand/understand009.html&#34;&gt;Mel&amp;rsquo;s book on PF_MEMALLOC&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://lore.kernel.org/patchwork/cover/178099/&#34;&gt;Kill PF_MEMALLOC abuse&lt;/a&gt;&lt;br /&gt;
 get_page_from_freelist and __ac_get_obj&lt;br /&gt;
                 * page is set pfmemalloc is when ALLOC_NO_WATERMARKS was&lt;br /&gt;
                 * necessary to allocate the page. The expectation is&lt;br /&gt;
                 * that the caller is taking steps that will free more&lt;br /&gt;
                 * memory. The caller should avoid the page being used&lt;br /&gt;
                 * for !PFMEMALLOC purposes.&lt;br /&gt;
                if (alloc_flags &amp;amp; ALLOC_NO_WATERMARKS)&lt;br /&gt;
                        set_page_pfmemalloc(page);&lt;/p&gt;

&lt;h2 id=&#34;users-of-pf-memalloc&#34;&gt;Users of PF_MEMALLOC&lt;/h2&gt;

&lt;p&gt;kswapd and __alloc_pages_direct_reclaim -&amp;gt; __perform_reclaim set PF_MEMALLOC.&lt;br /&gt;
commit c93bdd0e03e848555d144eb44a1f275b871a8dd5&lt;br /&gt;
Author: Mel Gorman &lt;a href=&#34;mailto:mgorman@suse.de&#34;&gt;mgorman@suse.de&lt;/a&gt;&lt;br /&gt;
Date:   Tue Jul 31 16:44:19 2012 -0700&lt;br /&gt;
    netvm: allow skb allocation to use PFMEMALLOC reserves&lt;/p&gt;

&lt;h1 id=&#34;pf-swapwrite-swapwrite-originally-means-swap-space-but-now-stands-for-kswapd-or-zone-reclaim-and-migration&#34;&gt;PF_SWAPWRITE - swapwrite originally meant writing to swap space, but does it now stand for kswapd, zone reclaim, and migration?&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://lore.kernel.org/linux-mm/20051025193023.6828.89649.sendpatchset@schroedinger.engr.sgi.com/#r&#34;&gt;Swap Migration V4: Overview&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://lwn.net/Articles/157936/&#34;&gt;Swap Migration V5: Overview&lt;/a&gt;&lt;br /&gt;
commit 930d915252edda7042c944ed3c30194a2f9fe163&lt;br /&gt;
Refs: v2.6.15-1460-g930d915252ed&lt;br /&gt;
Author:     Christoph Lameter &lt;a href=&#34;mailto:clameter@sgi.com&#34;&gt;clameter@sgi.com&lt;/a&gt;&lt;br /&gt;
AuthorDate: Sun Jan 8 01:00:47 2006 -0800&lt;br /&gt;
    [PATCH] Swap Migration V5: PF_SWAPWRITE to allow writing to swap&lt;br /&gt;
    Add PF_SWAPWRITE to control a processes permission to write to swap.&lt;br /&gt;
    - Use PF_SWAPWRITE in may_write_to_queue() instead of checking for kswapd and pdflush&lt;br /&gt;
    - Set PF_SWAPWRITE flag for kswapd and pdflush&lt;/p&gt;

&lt;h2 id=&#34;firo&#34;&gt;Firo&lt;/h2&gt;

&lt;p&gt;The original migration code: &lt;a href=&#34;https://lore.kernel.org/linux-mm/20051025193039.6828.74991.sendpatchset@schroedinger.engr.sgi.com/&#34;&gt;swap_pages&lt;/a&gt;&lt;br /&gt;
It seems PF_SWAPWRITE can be removed from the migration code, since it is not used while migrating pages.&lt;br /&gt;
Could I remove it completely?&lt;/p&gt;

&lt;h1 id=&#34;zone-lists&#34;&gt;Zone lists&lt;/h1&gt;

&lt;p&gt;struct zonelist node_zonelists[MAX_ZONELISTS];&lt;br /&gt;
 * [0]  : Zonelist with fallback&lt;br /&gt;
 * [1]  : No fallback (__GFP_THISNODE)&lt;br /&gt;
start_kernel -&amp;gt; build_all_zonelists&lt;br /&gt;
or memory hotplug, or a write to /proc/sys/vm/numa_zonelist_order: numa_zonelist_order_handler&lt;br /&gt;
  node_zonelists = {{               # Fallback zones: this zonelist includes all zones from all nodes.&lt;br /&gt;
      _zonerefs = {{&lt;br /&gt;
          zone = 0xffff88107ffd5d80, # node 0&lt;br /&gt;
          zone_idx = 2&lt;br /&gt;
          zone = 0xffff88107ffd56c0, # node 0&lt;br /&gt;
          zone_idx = 1&lt;br /&gt;
          zone = 0xffff88107ffd5000, # node 0&lt;br /&gt;
          zone_idx = 0&lt;br /&gt;
          zone = 0xffff88207ffd2d80, # Node 1; fallback.&lt;br /&gt;
          zone_idx = 2&lt;br /&gt;
          zone = 0x0,&lt;br /&gt;
          zone_idx = 0&lt;br /&gt;
    &amp;hellip;}}}&lt;br /&gt;
    node_zonelists[1]           # Nofallback zones&lt;/p&gt;

&lt;h1 id=&#34;lqo&#34;&gt;LQO&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://lwn.net/Articles/22909/&#34;&gt;Driver porting: low-level memory allocation&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://lwn.net/Articles/627419/&#34;&gt;The &amp;ldquo;too small to fail&amp;rdquo; memory-allocation rule&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://lwn.net/Articles/723317/&#34;&gt;Revisiting &amp;ldquo;too small to fail&amp;rdquo;&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;high-order-atomic-allocations&#34;&gt;High-order atomic allocations&lt;/h1&gt;

&lt;p&gt;commit 0aaa29a56e4fb0fc9e24edb649e2733a672ca099&lt;br /&gt;
Author: Mel Gorman &lt;a href=&#34;mailto:mgorman@techsingularity.net&#34;&gt;mgorman@techsingularity.net&lt;/a&gt;&lt;br /&gt;
Date:   Fri Nov 6 16:28:37 2015 -0800&lt;br /&gt;
    mm, page_alloc: reserve pageblocks for high-order atomic allocations on demand&lt;/p&gt;

&lt;h1 id=&#34;hot-and-cold-pages-pcp-list&#34;&gt;Hot and cold pages, pcp list&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://lwn.net/Articles/14768/&#34;&gt;Hot and cold pages&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://patchwork.kernel.org/patch/10013971/&#34;&gt;mm, Remove cold parameter from free_hot_cold_page*&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;fair-zone-allocation-obsoleted-but-see-gfp-write&#34;&gt;Fair-zone allocation - obsoleted but see __GFP_WRITE&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://lore.kernel.org/patchwork/patch/691300/&#34;&gt;mm, page_alloc: Remove fair zone allocation policy&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://lwn.net/Articles/576778/&#34;&gt;Configurable fair allocation zone policy&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;compaction-and-reclamation&#34;&gt;Compaction and reclamation&lt;/h1&gt;

&lt;p&gt;Direct reclaim: do_try_to_free_pages(), counted by vm_event_item ALLOCSTALL&lt;br /&gt;
Kswapd: balance_pgdat(), counted by vm_event_item PAGEOUTRUN&lt;/p&gt;

&lt;h1 id=&#34;buddy-memory-system-1963-1965&#34;&gt;Buddy memory system 1963 ~ 1965&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;http://sci-hub.tw/https://dl.acm.org/citation.cfm?doid=365628.365655&#34;&gt;buddy system 1965 a fast storage allocator.&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://en.wikipedia.org/wiki/Buddy_memory_allocation&#34;&gt;Buddy memory allocation&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://dl.acm.org/citation.cfm?id=359626&#34;&gt;buddy system variants 1977&lt;/a&gt;&lt;br /&gt;
The following is quoted from the 1965 paper above.&lt;br /&gt;
The operations involved in obtaining blocks from and returning them to the free&lt;br /&gt;
storage lists are very fast, making this scheme particularly appropriate for list structure operations and for other&lt;br /&gt;
situations involving many sizes of blocks which are fixed in size and location. This is in fact the storage bookkeeping&lt;br /&gt;
method used in the Bell Telephone Laboratories Low-Level List Language.&lt;/p&gt;

&lt;h2 id=&#34;osidp&#34;&gt;OSIDP&lt;/h2&gt;

&lt;p&gt;Both fixed and dynamic partitioning schemes have drawbacks. A fixed partitioning&lt;br /&gt;
scheme limits the number of active processes and may use space inefficiently if there is&lt;br /&gt;
a poor match between available partition sizes and process sizes. A dynamic partitioning&lt;br /&gt;
scheme is more complex to maintain and includes the overhead of compaction. An&lt;br /&gt;
interesting compromise is the buddy system.&lt;/p&gt;

&lt;h2 id=&#34;translations&#34;&gt;Translations&lt;/h2&gt;

&lt;p&gt;free_area; page_is_buddy; PageBuddy(buddy) &amp;amp;&amp;amp; page_order(buddy)&lt;br /&gt;
setup_arch-&amp;gt;x86_init.paging.pagetable_init = native_pagetable_init&lt;br /&gt;
        sparse_init vmemmap_populate      # vmemmap&lt;br /&gt;
        zone_sizes_init free_area_init_core zone_pcp_init&lt;br /&gt;
        memmap_init_zone # Memory map a) Set all pages to reserved. MIGRATE_MOVABLE? b) Set node, zone in page-&amp;gt;flags; set_page_links&lt;/p&gt;
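
&lt;p&gt;The notes above mention page_is_buddy. As a rough sketch of the arithmetic behind a buddy pair: the buddy of a block of 2**order pages is found by flipping bit number order of its page frame number. The function names buddy_pfn() and merged_pfn() are chosen for illustration, and the sketch assumes the pfn passed in is aligned to 2**order.&lt;/p&gt;

```python
# Illustrative buddy arithmetic on page frame numbers; not kernel code.
def buddy_pfn(pfn, order):
    """The buddy of a 2**order block differs from it only in bit `order`."""
    return pfn ^ (2 ** order)


def merged_pfn(pfn, order):
    """Start of the 2**(order + 1) block formed by merging with the buddy."""
    return min(pfn, buddy_pfn(pfn, order))


print(buddy_pfn(8, 2))    # 12
print(buddy_pfn(12, 2))   # 8
print(merged_pfn(12, 2))  # 8
```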

&lt;h3 id=&#34;buddy-init&#34;&gt;Buddy init&lt;/h3&gt;

&lt;p&gt;mem_init-&amp;gt; memblock_free_all or free_all_bootmem # /* this will put all low memory onto the freelists */&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Kernel memory bug - SLAB&#39;s 3 lists are corrupted.</title>
      <link>http://firoyang.org/howto/bug_mm_1/</link>
      <pubDate>Wed, 02 Jan 2019 00:00:00 UTC</pubDate>
      <author>Firo Yang</author>
      <guid>http://firoyang.org/howto/bug_mm_1/</guid>
      <description>

&lt;p&gt;Recently, I was working on a kernel memory bug.&lt;/p&gt;

&lt;p&gt;&lt;a href=&#34;https://apibugzilla.suse.com/show_bug.cgi?id=1118875&#34;&gt;https://apibugzilla.suse.com/show_bug.cgi?id=1118875&lt;/a&gt;&lt;br /&gt;
L3: kernel BUG at ../mm/slab.c:2804! bad LRU list and active values in page structs in possible use-after-free&lt;/p&gt;

&lt;p&gt;After digging into the vmcore file captured by kdump, I found the following.&lt;/p&gt;

&lt;h1 id=&#34;node-0&#34;&gt;Node 0&lt;/h1&gt;

&lt;h2 id=&#34;partial&#34;&gt;Partial&lt;/h2&gt;

&lt;p&gt;list page.lru  -H 0xffff8801a7c01348 -s page.lru,s_mem,active,slab_cache,flags &amp;gt;n0p.log&lt;br /&gt;
n0p -&amp;gt; n0f=0xffff8801a7c01358&lt;/p&gt;

&lt;h2 id=&#34;full&#34;&gt;Full&lt;/h2&gt;

&lt;p&gt;list page.lru  -H 0xffff8801a7c01358 -s page.lru,s_mem,active,slab_cache,flags &amp;gt;n0f.log&lt;br /&gt;
n0f -&amp;gt;&lt;br /&gt;
ffffea0006902380&lt;br /&gt;
    lru = {&lt;br /&gt;
      next = 0xffffea0080ed53e0,&lt;br /&gt;
      prev = 0xffffea00405f8ae0&lt;br /&gt;
    }&lt;br /&gt;
    s_mem = 0xffff8801a408e000&lt;br /&gt;
      active = 16&lt;br /&gt;
    slab_cache = 0xffff8801a7c00400&lt;br /&gt;
  flags = 6755398367314048&lt;br /&gt;
ffffea0080ed53c0&lt;br /&gt;
    lru = {&lt;br /&gt;
      next = 0xffffea00422a34e0,&lt;br /&gt;
      prev = 0xffffea00069023a0&lt;br /&gt;
    }&lt;br /&gt;
    s_mem = 0xffff88203b54f000&lt;br /&gt;
      active = 7&lt;br /&gt;
    slab_cache = 0xffff8801a7c00400&lt;br /&gt;
  flags = 24769796876796032&lt;br /&gt;
&amp;hellip; -&amp;gt; n1f = 0xffff881107c00358&lt;/p&gt;

&lt;h1 id=&#34;node-1&#34;&gt;Node 1&lt;/h1&gt;

&lt;h2 id=&#34;partial-1&#34;&gt;Partial&lt;/h2&gt;

&lt;p&gt;crash&amp;gt; list page.lru  -H 0xffff881107c00348 -s page.lru,s_mem,active,slab_cache,flags &amp;gt;n1p.log&lt;br /&gt;
n1p -&amp;gt; SLAB ffffea0043ab74e0 -&amp;gt; 0xffff881107c00348 = n1p&lt;br /&gt;
The prev pointer of SLAB ffffea0043ab74e0 points to 0xffff881107c00358 (n1f)&lt;/p&gt;

&lt;h2 id=&#34;full-1&#34;&gt;Full&lt;/h2&gt;

&lt;p&gt;crash&amp;gt; list page.lru  -H 0xffff881107c00358 -s page.lru,s_mem,active,slab_cache,flags &amp;gt;n1f.log&lt;br /&gt;
n1f-&amp;gt; SLAB ffffea0043ab74e0  -&amp;gt; &amp;hellip; -&amp;gt; 0xffff881107c00348 = n1p&lt;/p&gt;

&lt;p&gt;This issue occurred on a NUMA system with 2 memory nodes.&lt;br /&gt;
The SLAB partial and full lists of both node 0 and node 1 were corrupted. After looking into this issue for a few days, I talked to Vlastimil Babka.&lt;br /&gt;
He provided a fix for this issue: 7810e6781e0fcbca78b91cf65053f895bf59e85f - mm, page_alloc: do not break __GFP_THISNODE by zonelist reset.&lt;/p&gt;

&lt;p&gt;Now I have a question: why could I not solve this issue myself?&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>memory mapping</title>
      <link>http://firoyang.org/cs/mem_map/</link>
      <pubDate>Wed, 22 Aug 2018 21:39:41 CST</pubDate>
      <author>Firo Yang</author>
      <guid>http://firoyang.org/cs/mem_map/</guid>
      <description>

&lt;p&gt;This article is about user-space memory mapping; it is not limited to the mmap(2) system call.&lt;br /&gt;
&lt;a href=&#34;https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/com.ibm.aix.genprogc/understanding_mem_mapping.htm&#34;&gt;Understanding memory mapping&lt;/a&gt;&lt;br /&gt;
TLPI: Chapter 49 and LSP: Chapter 8&lt;/p&gt;

&lt;h1 id=&#34;history&#34;&gt;History&lt;/h1&gt;

&lt;p&gt;BSD 4.2&lt;br /&gt;
1990 SunOS 4.1&lt;br /&gt;
&lt;a href=&#34;http://bitsavers.trailing-edge.com/pdf/sun/sunos/4.1/800-3846-10A_System_Services_Overview_199003.pdf&#34;&gt;A Must-read: The applications programmer gains access to the facilities of the VM system through several sets of system calls.&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;memory-mappings&#34;&gt;Memory mappings&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://landley.net/writing/memory-faq.txt&#34;&gt;What are memory mappings? - Landley&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A memory mapping is a set of page table entries describing the properties&lt;br /&gt;
of a consecutive virtual address range.  Each memory mapping has a&lt;br /&gt;
start address and length, permissions (such as whether the program can&lt;br /&gt;
read, write, or execute from that memory), and associated resources (such&lt;br /&gt;
as physical pages, swap pages, file contents, and so on).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1 id=&#34;vma&#34;&gt;VMA&lt;/h1&gt;

&lt;p&gt;A VMA is sized in units of PAGE_SIZE.&lt;/p&gt;

&lt;h2 id=&#34;split-vma&#34;&gt;split_vma&lt;/h2&gt;

&lt;p&gt;new_below&lt;br /&gt;
commit 5846fc6c31162234e88bdfd91548b1cf0d2cebbd&lt;br /&gt;
Author: Andrew Morton &lt;a href=&#34;mailto:akpm@digeo.com&#34;&gt;akpm@digeo.com&lt;/a&gt;&lt;br /&gt;
Date:   Tue Sep 17 06:35:47 2002 -0700&lt;br /&gt;
    [PATCH] consolidate the VMA splitting code&lt;br /&gt;
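new_below says whether the newly created VMA takes the part below the split address, which in turn decides where the old VMA ends up. Easy to misread; bad naming!&lt;br /&gt;
0 means the old VMA keeps the head part; 1 means it keeps the tail part.&lt;/p&gt;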
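
&lt;p&gt;A small sketch of the new_below semantics described above, modelling a VMA as a (start, end) pair. The function name split_range() is hypothetical and the real split_vma() does far more (locking, anon_vma and file bookkeeping); only the choice of which half the old VMA keeps is shown.&lt;/p&gt;

```python
# Illustrative model of the new_below argument; not kernel code.
def split_range(start, end, addr, new_below):
    """Split [start, end) at addr; return (old_vma, new_vma) ranges."""
    if not (addr > start and end > addr):
        raise ValueError("addr must fall strictly inside the range")
    if new_below:
        # The new VMA takes the part below addr; the old one keeps the tail.
        return (addr, end), (start, addr)
    # The new VMA takes the part above addr; the old one keeps the head.
    return (start, addr), (addr, end)


print(split_range(0x1000, 0x4000, 0x2000, 0))  # ((4096, 8192), (8192, 16384))
print(split_range(0x1000, 0x4000, 0x2000, 1))  # ((8192, 16384), (4096, 8192))
```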

&lt;h1 id=&#34;release-memory-resources&#34;&gt;Release memory resources&lt;/h1&gt;

&lt;p&gt;exit_mm exit_mmap&lt;/p&gt;

&lt;h1 id=&#34;shared-memory-mapping&#34;&gt;Shared memory mapping&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://www.kernel.org/doc/gorman/html/understand/understand015.html&#34;&gt;Chapter 12  Shared Memory Virtual Filesystem:&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This is a very clean interface that is conceptually easy to understand but it does not help anonymous pages as there is no file backing. To keep this nice interface, Linux creates an artifical file-backing for anonymous pages using a RAM-based filesystem where each VMA is backed by a “file” in this filesystem. Every inode in the filesystem is placed on a linked list called shmem_inodes so that they may always be easily located. This allows the same file-based interface to be used without treating anonymous pages as a special case.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Firo: every time you create a shared memory mapping via mmap(2), you create an inode named dev/zero in the hidden shm_mnt filesystem.&lt;br /&gt;
Here dev/zero is only a name; it is unrelated to /dev/zero in drivers/char/mem.c. And /dev/shm is just a tmpfs mount; it is not the hidden shm_mnt mount, although POSIX&amp;rsquo;s shm_open uses /dev/shm.&lt;/p&gt;

&lt;h2 id=&#34;shared-anonymouse-mappings&#34;&gt;Shared anonymous mappings&lt;/h2&gt;

&lt;p&gt;&lt;a href=&#34;https://lore.kernel.org/patchwork/patch/174306/&#34;&gt;vmscan: limit VM_EXEC protection to file pages&lt;/a&gt;&lt;br /&gt;
If someone abuses the reclaim code via mmap(&amp;hellip;, VM_EXEC, SHARED|ANON), OOM may occur, since the old code protected such pages from reclaim by adding them back to the active list. Great patch. However, programs running from tmpfs are also penalized.&lt;br /&gt;
page_is_file_cache &amp;lt; !PageAnon&lt;br /&gt;
&lt;a href=&#34;https://lwn.net/Articles/452035/&#34;&gt;ashmem&lt;/a&gt;&lt;br /&gt;
* onset - mmap&lt;br /&gt;
do_mmap -&amp;gt; mmap_region -&amp;gt; vma_link -&amp;gt; (__shmem_file_setup) &amp;amp;&amp;amp; __vma_link_file: into i_mmap interval_tree.&lt;br /&gt;
* nucleus - shared fault&lt;br /&gt;
Read: do_read_fault&lt;br /&gt;
Write: do_shared_fault -&amp;gt; shmem_getpage_gfp shmem_add_to_page_cache&lt;br /&gt;
WP: do_wp_page -&amp;gt; wp_page_shared or wp_page_reuse&lt;br /&gt;
b)IPC using a shared file mapping&lt;/p&gt;

&lt;h2 id=&#34;history-1&#34;&gt;History&lt;/h2&gt;

&lt;p&gt;late 70s - IPC: see TLPI: Chapter 45 INTRODUCTION TO SYSTEM V IPC&lt;br /&gt;
they first appear together in Columbus UNIX, a Bell UNIX for database and efficient transaction processing&lt;br /&gt;
1983 - IPC: see TLPI or the Wikipedia article on shared memory.&lt;br /&gt;
they land together in System V that made them popular in mainstream UNIX-es, hence the name&lt;/p&gt;

&lt;p&gt;1983 - BSD mmap with shared vs private memory mapping&lt;br /&gt;
BSD 4.2: The system supports sharing of data between processes by allowing pages to be mapped into memory. These mapped pages may be shared with other processes or private to the process.&lt;/p&gt;

&lt;p&gt;1984 Jan - BSD mmap with file memory mapping support by SunOS&lt;br /&gt;
mmap seems to have been first implemented by &lt;a href=&#34;http://bitsavers.trailing-edge.com/pdf/sun/sunos/1.1/800-1108-01E_System_Interface_Manual_for_the_Sun_Workstation_Jan84.pdf&#34;&gt;SunOS 1.1&lt;/a&gt;&lt;br /&gt;
N.B. This call is not completely implemented in 4.2BSD.&lt;br /&gt;
More sunos docs: &lt;a href=&#34;http://bitsavers.trailing-edge.com/pdf/sun/sunos/&#34;&gt;http://bitsavers.trailing-edge.com/pdf/sun/sunos/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;1988&lt;br /&gt;
&lt;a href=&#34;https://en.wikipedia.org/wiki/Memory-mapped_file#History&#34;&gt;SunOS 4 introduced Unix&amp;rsquo;s mmap, which permitted programs &amp;ldquo;to map files into memory.&amp;rdquo;&lt;/a&gt;&lt;br /&gt;
1989&lt;br /&gt;
One paper found in OSTEP: &lt;a href=&#34;https://courses.cs.washington.edu/courses/cse551/09sp/papers/memory_coherence.pdf&#34;&gt;Memory Coherence in Shared Virtual Memory Systems&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&#34;shared-memory-in-kernel&#34;&gt;Shared memory in kernel&lt;/h2&gt;

&lt;h3 id=&#34;initial-version&#34;&gt;Initial version&lt;/h3&gt;

&lt;p&gt;history: commit 9cb9f18b5d26bf176e13edbc0c248d121217c6b3&lt;br /&gt;
Refs: &lt;0.99.10&gt;&lt;br /&gt;
Author:     Linus Torvalds &lt;a href=&#34;mailto:torvalds@linuxfoundation.org&#34;&gt;torvalds@linuxfoundation.org&lt;/a&gt;&lt;br /&gt;
AuthorDate: Fri Nov 23 15:09:11 2007 -0500&lt;br /&gt;
    [PATCH] Linux-0.99.10 (June 7, 1993)&lt;br /&gt;
Firo: search &amp;lsquo;shm_swap&amp;rsquo;&lt;/p&gt;

&lt;h3 id=&#34;ramfs-based&#34;&gt;Ramfs based&lt;/h3&gt;

&lt;p&gt;history: commit 4d372877c63baaaf4c1c3325cae43f6b9782e59e&lt;br /&gt;
Refs: &lt;2.4.0-test13pre3&gt;&lt;br /&gt;
Author:     Linus Torvalds &lt;a href=&#34;mailto:torvalds@linuxfoundation.org&#34;&gt;torvalds@linuxfoundation.org&lt;/a&gt;&lt;br /&gt;
AuthorDate: Fri Nov 23 15:40:55 2007 -0500&lt;br /&gt;
[&amp;hellip;]&lt;br /&gt;
    The shmfs cleanup should be unnoticeable except to users who use SAP with&lt;br /&gt;
    huge shared memory segments, where Christoph Rohlands work not only&lt;br /&gt;
    makes the code much more readable, it should also make it dependable..&lt;br /&gt;
[&amp;hellip;]&lt;br /&gt;
    - Christoph Rohland: shmfs for shared memory handling&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>From the Nihilism</title>
      <link>http://firoyang.org/philosophy/nihilism/</link>
      <pubDate>Wed, 31 Jan 2018 00:00:00 UTC</pubDate>
      <author>Firo Yang</author>
      <guid>http://firoyang.org/philosophy/nihilism/</guid>
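      <description>&lt;p&gt;Nihilism is essentially a problem of logic. First, the nihilist cannot prove that the meaning he hopes for exists. But this does not rule out the possibility that such a meaning (of life, of hope, of living, of something worth striving for) exists. So every nihilist hangs in this world by a rope woven from human instinct and the ceaseless thinking of self-consciousness. One slip and the rope may snap, leading to death.&lt;/p&gt;

&lt;p&gt;No matter how desperate a nihilist is, he cannot rationally deny the possibility that the meaning he hopes for exists.&lt;/p&gt;

&lt;p&gt;So the world may hold meaning rather than being utterly meaningless, because we cannot prove that everything is meaningless.&lt;/p&gt;

&lt;p&gt;From my own experience, nihilists do not truly want to do nothing. Rather, deep down they feel this world is not worthy, and they cannot wholeheartedly give themselves to it. As I see it, this world, despite thousands of years of human polishing, is still a wilderness, a wasteland, both materially and spiritually. Humans have been dropped into this wilderness at random.&lt;/p&gt;

&lt;p&gt;A nihilist is simply a sincere thinker. It is not that nihilists are empty inside; it is the emptiness of the present world, which fails to offer them enough meaning, that turns people into nihilists.&lt;/p&gt;

&lt;p&gt;Our nihilism stems from our weakness.&lt;/p&gt;

&lt;p&gt;The strength of an individual is always limited, and the people affected by nihilism are after all a minority, so nihilists should unite and use scientific means to work out why the world exists. Many nihilists will die along the way. That is the sorrow of the short human life. Nihilists ought to have lives spanning tens of millions of years, because so often we have given up many practical pursuits. So we may need to gather a community of nihilists across several generations to finally reach this goal. Understanding why the world exists would reveal the origin of everything, that is, the very beginning of meaning, and unravel every riddle.&lt;/p&gt;

&lt;p&gt;At the same time, there may be some meaning, something a creator wishes us to accomplish, waiting for us in the future.&lt;/p&gt;

&lt;p&gt;Nihilism is not all pessimism and harm. At the very least it rejects authority, which gives us more freedom in real life and in reason, and reduces the suffering caused by certain deceptions.&lt;/p&gt;

&lt;p&gt;Inevitably, as humans and as nihilists, we must live well in order to find some meaning, even though inwardly we cannot accept that living is our meaning. But this is the gift this earth has given me: freedom, together with its by-product, bondage. The bondage of the earth that comes with being human.&lt;/p&gt;

&lt;p&gt;Nihilists easily neglect their own feelings. Compared with other people, we are more prone to wronging ourselves. In purely rational, logical thinking, the link between reality and thought is severed, and it is easier to sink into a mire of thought and be unable to get out. The self is a precondition of any meaning, so understanding oneself is a must. The home ground of the nihilist is rational thought, while feelings that come from the outside world are easily overlooked. The self is made up of at least rational thought and the perception of the external world.&lt;/p&gt;

&lt;p&gt;I do not agree that nihilism is merely a purely subjective problem of reason, though logic helps us sort the problem out. To some extent it is a problem of objective reality. Seen this way, nihilism is a problem raised jointly by reason and reality. Reason can easily express the problem through signifiers and is not daunted by any meaning, whereas in the vast ocean of the real world a person leaves no trace. This can even lead one to think nihilism is a purely subjective problem of reason, and so to ignore reality and even the feelings that come from the outside world. Since those feelings are part of our meaning, we should understand our subjective feelings and bring them to a healthy state, just like the critical, healthy state of our reason. Of course, we do not know how much healthy subjective feelings will help our search for meaning. We are only in the process of searching. The so-called healthy state, at the level of thought, is rational, critical, and potentially free. What state should subjective feelings reach? First of all, they should not limit or constrain the healthy state of reason. The foundation of our existence is the self-consciousness within our reason. Who am I, and who is me? Individual consciousness is continuously shaped under the influence of the outside world. One could say that the consciousness of the individual is the consciousness of the world. The world itself is also searching for its will. I am myself, and I am also a part of the world. At the same time the world is full of contradictions, with all kinds of consciousness influencing one another. The consciousness of the individual makes it possible to follow a meaning of one&amp;rsquo;s own. One should defend oneself. In this world everyone is pursuing their own meaning, perhaps unconsciously, perhaps with a goal, but either way the meanings of other people affect us. The competition is so fierce that under the sunlight this world hides a great deal of evil, which now and then can no longer be contained and leaks out. So sustaining one&amp;rsquo;s own existence is especially important in life; this is the whole of meaning. And the unbearable lightness in life becomes, from time to time, the fuse for wronging oneself.&lt;/p&gt;

&lt;p&gt;Feelings grounded in reality are that necessary for keeping the subjective will of an individual healthy. This is not nitpicking.&lt;/p&gt;

&lt;p&gt;Once this is achieved, a true self emerges. This is the inner content that the meaning of life should express: we secure our own freedom and health to the greatest possible extent, and our individual will unfolds and expresses itself to the fullest. This is the self we hope for. Resist oppression in all its forms.&lt;/p&gt;

&lt;p&gt;Searching for the meaning of life. For someone living an unexamined life, the goals and meanings he pursues are largely assigned by the world, that is, they express the will of the world and his own instincts rather than his true intent. As Griffith put it, some people never learn what they live for and in the end slowly drift out of this world. He also spoke of being enslaved by a dream.&lt;/p&gt;

&lt;p&gt;Compared with pure rational thought, does life deserve a meaning?&lt;/p&gt;

&lt;p&gt;With so many grand meanings around, why can life not be given one?&lt;/p&gt;

&lt;p&gt;The meaning of life should not be unique.&lt;/p&gt;

&lt;p&gt;Pursue a free and independent will.&lt;br /&gt;
Resist exploitation, oppression, and enslavement.&lt;br /&gt;
Resist and escape the constraints that go unnoticed yet are everywhere, that pervade this society and subtly, silently deform people, restrict the growth of individual freedom, frighten them into silence, and make them give up the nature of life itself.&lt;/p&gt;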
</description>
    </item>
    
    <item>
      <title>Memory consistency model</title>
      <link>http://firoyang.org/cs/consistency_model/</link>
      <pubDate>Sat, 16 Dec 2017 15:46:12 CST</pubDate>
      <author>Firo Yang</author>
      <guid>http://firoyang.org/cs/consistency_model/</guid>
      <description>

&lt;p&gt;When we talk about a memory model, we are referring to a memory consistency model, also called a memory ordering model.&lt;/p&gt;

&lt;h1 id=&#34;hisotry&#34;&gt;History&lt;/h1&gt;

&lt;p&gt;1979&lt;br /&gt;
&lt;a href=&#34;https://www.microsoft.com/en-us/research/uploads/prod/2016/12/How-to-Make-a-Multiprocessor-Computer-That-Correctly-Executes-Multiprocess-Programs.pdf&#34;&gt;How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs&lt;/a&gt;&lt;br /&gt;
1987 ~ 1990&lt;br /&gt;
&lt;a href=&#34;https://cs.brown.edu/~mph/HerlihyW90/p463-herlihy.pdf&#34;&gt;Linearizability: A Correctness Condition for Concurrent Objects&lt;/a&gt;&lt;br /&gt;
1989&lt;br /&gt;
&lt;a href=&#34;http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.8.3766&amp;amp;rep=rep1&amp;amp;type=pdf&#34;&gt;processor consistency: CACHE CONSISTENCY AND SEQUENTIAL CONSISTENCY&lt;/a&gt;&lt;br /&gt;
1990&lt;br /&gt;
&lt;a href=&#34;https://dl.acm.org/citation.cfm?id=325102&#34;&gt;Release consistency: Memory consistency and event ordering in scalable shared-memory multiprocessors&lt;/a&gt;&lt;br /&gt;
1991&lt;br /&gt;
&lt;a href=&#34;https://dl.acm.org/citation.cfm?id=113406&#34;&gt;Proving sequential consistency of high-performance shared memories&lt;/a&gt;&lt;br /&gt;
1992&lt;br /&gt;
&lt;a href=&#34;https://www.gaisler.com/doc/sparcv8.pdf&#34;&gt;TSO Sparc v8: A standard memory model called Total Store Ordering (TSO) is defined for SPARC&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://link.springer.com/chapter/10.1007/978-1-4615-3604-8_2&#34;&gt;Formal Specification of Memory Models: and two store ordered models TSO and PSO defined by the Sun Microsystem&amp;rsquo;s SPARC architecture.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2001 ~ Present&lt;br /&gt;
&lt;a href=&#34;https://www.youtube.com/watch?v=WUfvvFD5tAA&#34;&gt;IA64 memory ordering&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;purposes&#34;&gt;Purposes&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://www.cs.cmu.edu/afs/cs/academic/class/15418-s12/www/lectures/14_relaxedReview.pdf&#34;&gt;Motivation: hiding latency&lt;/a&gt;&lt;br /&gt;
▪ Why are we interested in relaxing ordering requirements?&lt;br /&gt;
- Performance&lt;br /&gt;
- Specifically, hiding memory latency: overlap memory accesses with other operations&lt;br /&gt;
- Remember, memory access in a cache coherent system may entail much more than&lt;br /&gt;
simply reading bits from memory (finding data, sending invalidations, etc.)&lt;/p&gt;

&lt;h2 id=&#34;why-tso-it-s-because-that-write-buffer-or-store-buffer-is-not-invisible-any-more-for-multiprocessor-https-www-cis-upenn-edu-devietti-classes-cis601-spring2016-sc-tso-pdf&#34;&gt;Why TSO? &lt;a href=&#34;https://www.cis.upenn.edu/~devietti/classes/cis601-spring2016/sc_tso.pdf&#34;&gt;Because the write buffer (store buffer) is no longer invisible on a multiprocessor&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;TSO abandons SC in order to allow the use of a FIFO write buffer.&lt;br /&gt;
&lt;a href=&#34;https://www.cs.utexas.edu/~bornholt/post/memory-models.html&#34;&gt;An example: There’s no reason why performing event (2) (a read from B) needs to wait until event (1) (a write to A) completes. They don’t interfere with each other at all, and so should be allowed to run in parallel. See Memory Consistency Models: A Primer&lt;/a&gt;&lt;br /&gt;
Hide the write latency by putting the data in the store buffer.&lt;/p&gt;
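
&lt;p&gt;A toy simulation of the store-buffer effect described above, using the classic store buffering litmus test: each CPU writes one variable and then reads the other. The Cpu class and its method names are made up for this sketch, and the interleaving is fixed by hand; it only shows that a FIFO store buffer lets both loads return 0, an outcome SC forbids.&lt;/p&gt;

```python
# Toy model of per-CPU FIFO store buffers; not a real hardware model.
from collections import deque


class Cpu:
    def __init__(self, memory):
        self.memory = memory   # shared dict standing in for main memory
        self.buffer = deque()  # private FIFO of pending (addr, value) stores

    def store(self, addr, value):
        """Queue the store; other CPUs cannot see it until drain()."""
        self.buffer.append((addr, value))

    def load(self, addr):
        """Forward from the newest matching buffered store, else read memory."""
        for a, v in reversed(self.buffer):
            if a == addr:
                return v
        return self.memory[addr]

    def drain(self):
        """Commit buffered stores to shared memory in FIFO order."""
        while self.buffer:
            a, v = self.buffer.popleft()
            self.memory[a] = v


memory = {"x": 0, "y": 0}
cpu0, cpu1 = Cpu(memory), Cpu(memory)

cpu0.store("x", 1)   # CPU 0: x = 1 (still in its store buffer)
cpu1.store("y", 1)   # CPU 1: y = 1 (still in its store buffer)
r1 = cpu0.load("y")  # CPU 0 reads y from memory
r2 = cpu1.load("x")  # CPU 1 reads x from memory
cpu0.drain()
cpu1.drain()

print(r1, r2)  # 0 0
```

&lt;p&gt;Draining each buffer before the load, which is roughly what a full fence does, would rule out the 0 0 result.&lt;/p&gt;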

&lt;h3 id=&#34;why-not-read-write-reordering&#34;&gt;Why not read-write reordering?&lt;/h3&gt;

&lt;p&gt;Reordering a read with a later write is nonsense.&lt;/p&gt;

&lt;h1 id=&#34;formal-cause&#34;&gt;Formal cause&lt;/h1&gt;

&lt;p&gt;Shared memory&lt;br /&gt;
Multiprocessor&lt;br /&gt;
Memory access&lt;br /&gt;
program order&lt;br /&gt;
&lt;a href=&#34;https://www.hpl.hp.com/techreports/Compaq-DEC/WRL-95-7.pdf&#34;&gt;Recommended by CAAQA: Observity in SC, TSO, PC: the section &amp;ldquo;Relaxing the Write to Read Program Order&amp;rdquo; in Shared Memory Consistency Models: A Tutorial&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;http://www.rdrop.com/users/paulmck/scalability/paper/whymb.2010.06.07c.pdf&#34;&gt;Memory Barriers: a Hardware View for Software Hackers - must read&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;http://15418.courses.cs.cmu.edu/spring2013/article/41&#34;&gt;&amp;lsquo;A Summary of Relaxed Consistency&amp;rsquo; CMU&lt;/a&gt; &lt;a href=&#34;https://www.cs.cmu.edu/afs/cs/academic/class/15418-s12/www/lectures/14_relaxedReview.pdf&#34;&gt;Slides&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&#34;sc&#34;&gt;SC&lt;/h2&gt;

&lt;p&gt;&lt;a href=&#34;https://www.microsoft.com/en-us/research/uploads/prod/2016/12/How-to-Make-a-Multiprocessor-Computer-That-Correctly-Executes-Multiprocess-Programs.pdf&#34;&gt;sequential consistency&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://jepsen.io/consistency/models/sequential#formally&#34;&gt;Formal of Sequential Consistency by Jepsen&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&#34;tso&#34;&gt;TSO&lt;/h2&gt;

&lt;p&gt;Total Store Ordering is specified in Appendix K of the SPARC v8 manual.&lt;/p&gt;

&lt;h3 id=&#34;tso-in-x86&#34;&gt;TSO in x86&lt;/h3&gt;

&lt;p&gt;&lt;a href=&#34;https://www.cl.cam.ac.uk/~pes20/weakmemory/x86tso-paper.tphols.pdf&#34;&gt;A Better x86 Memory Model: x86-TSO&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://stackoverflow.com/questions/27595595/when-are-x86-lfence-sfence-and-mfence-instructions-required&#34;&gt;When are x86 LFENCE, SFENCE and MFENCE instructions required?&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&#34;tso-vs-pc&#34;&gt;TSO vs PC:&lt;/h3&gt;

&lt;p&gt;&lt;a href=&#34;http://15418.courses.cs.cmu.edu/spring2013/article/41&#34;&gt;&amp;lsquo;A Summary of Relaxed Consistency&amp;rsquo; CMU&lt;/a&gt; &lt;a href=&#34;https://www.cs.cmu.edu/afs/cs/academic/class/15418-s12/www/lectures/14_relaxedReview.pdf&#34;&gt;Slides&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&#34;tso-and-peterson-s-algorithm&#34;&gt;TSO and Peterson&amp;rsquo;s algorithm&lt;/h3&gt;

&lt;p&gt;&lt;a href=&#34;https://bartoszmilewski.com/2008/11/05/who-ordered-memory-fences-on-an-x86/&#34;&gt;Who ordered memory fences on an x86?&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://www.cnblogs.com/caidi/p/6708789.html&#34;&gt;Mutual entry and starvation (in Chinese)&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&#34;pc&#34;&gt;PC&lt;/h2&gt;

&lt;p&gt;&lt;a href=&#34;http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.8.3766&amp;amp;rep=rep1&amp;amp;type=pdf&#34;&gt;processor consistency: CACHE CONSISTENCY AND SEQUENTIAL CONSISTENCY&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&#34;wc&#34;&gt;WC&lt;/h2&gt;

&lt;p&gt;&lt;a href=&#34;https://people.eecs.berkeley.edu/~kubitron/cs252/handouts/oldquiz/p434-dubois.pdf&#34;&gt;weak consistency: Memory access buffering in multiprocessors&lt;/a&gt;&lt;br /&gt;
They distinguish between ordinary shared accesses and synchronization accesses, where the latter are used to control concurrency&lt;br /&gt;
between several processes and to maintain the integrity of ordinary shared data.&lt;/p&gt;

&lt;h2 id=&#34;rc&#34;&gt;RC&lt;/h2&gt;

&lt;p&gt;&lt;a href=&#34;https://dl.acm.org/citation.cfm?id=325102&#34;&gt;Firo: a must-read: Release consistency: Memory consistency and event ordering in scalable shared-memory multiprocessors&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://docs.microsoft.com/en-us/windows/win32/dxtecharts/lockless-programming?redirectedfrom=MSDN#read-acquire-and-write-release-barriers&#34;&gt;Must-read: Lockless Programming Considerations for Xbox 360 and Microsoft Windows&lt;/a&gt;&lt;br /&gt;
From the top right of page 6 of the release consistency paper:&lt;br /&gt;
Condition 3.1: Conditions for Release Consistency&lt;br /&gt;
(A) before an ordinary load or store access is allowed to perform with respect to any other processor, all previous acquire accesses must be performed, and&lt;br /&gt;
(B) before a release access is allowed to perform with respect to any other processor, all previous ordinary load and store accesses must be performed, and&lt;br /&gt;
(C) special accesses are processor consistent with respect to one another.&lt;br /&gt;
&lt;a href=&#34;https://preshing.com/20120913/acquire-and-release-semantics/&#34;&gt;Acquire and Release Semantics&lt;/a&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>The definitive guide to Linux x86 entries</title>
      <link>http://firoyang.org/cs/entry/</link>
      <pubDate>Wed, 26 Apr 2017 21:39:41 CST</pubDate>
      <author>Firo Yang</author>
      <guid>http://firoyang.org/cs/entry/</guid>
      <description>

&lt;h1 id=&#34;all-entries&#34;&gt;All entries&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://www.kernel.org/doc/Documentation/x86/entry_64.txt&#34;&gt;Documentation/x86/entry_64.txt&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;entry-irq&#34;&gt;Entry irq&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;http://www.lenky.info/archives/2013/03/2245&#34;&gt;A fresh look at hard interrupts on the Linux x86-64 architecture&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&#34;steps-to-handle-intterrupt&#34;&gt;Steps to handle an interrupt&lt;/h2&gt;

&lt;p&gt;For translating a logical address to a linear address, see Intel SDM Vol. 3A, 3.4 Logical and Linear Addresses.&lt;br /&gt;
For stack switching when privilege is escalated, see SDM Vol. 3A, 5.8.5 Stack Switching. On a privilege-level change the processor automatically chooses the stack to use: the ESP for the new CPL (ESP0, ESP1 or ESP2 in the TSS).&lt;br /&gt;
For more details on stack switching, see Figure 5-13. Stack Switching During an Interprivilege-Level Call.&lt;br /&gt;
For fast system calls, see Vol. 3A, 5.8.7 Performing Fast Calls to System Procedures.&lt;br /&gt;
For the TSS and TR, see Vol. 3A, 7.2.&lt;br /&gt;
For how Linux handles IRQs, see ULK 3rd edition, Chapter 4: Hardware Handling of Interrupts and Exceptions.&lt;/p&gt;

&lt;h1 id=&#34;entry-exception&#34;&gt;Entry exception&lt;/h1&gt;

&lt;h2 id=&#34;paranoid-entry&#34;&gt;paranoid_entry&lt;/h2&gt;

&lt;p&gt;Check Documentation/x86/entry_64.txt&lt;/p&gt;

&lt;h2 id=&#34;error-entry&#34;&gt;error_entry&lt;/h2&gt;

&lt;p&gt;tglx: commit 0457d99a336be658cea1a5bdb689de5adb3b382d&lt;br /&gt;
Author:     Andi Kleen &lt;a href=&#34;mailto:ak@muc.de&#34;&gt;ak@muc.de&lt;/a&gt;&lt;br /&gt;
AuthorDate: Tue Feb 12 20:17:35 2002 -0800&lt;br /&gt;
Commit:     Linus Torvalds &lt;a href=&#34;mailto:torvalds@home.transmeta.com&#34;&gt;torvalds@home.transmeta.com&lt;/a&gt;&lt;br /&gt;
CommitDate: Tue Feb 12 20:17:35 2002 -0800&lt;br /&gt;
    [PATCH] x86_64 merge: arch + asm&lt;/p&gt;

&lt;h1 id=&#34;entry-system-calls&#34;&gt;Entry system calls&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://blog.packagecloud.io/eng/2016/04/05/the-definitive-guide-to-linux-system-calls/&#34;&gt;The Definitive Guide to Linux System Calls&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&#34;fast-path&#34;&gt;Fast path&lt;/h2&gt;

&lt;p&gt;commit 21d375b6b34ff511a507de27bf316b3dde6938d9&lt;br /&gt;
Author: Andy Lutomirski &lt;a href=&#34;mailto:luto@kernel.org&#34;&gt;luto@kernel.org&lt;/a&gt;&lt;br /&gt;
Date:   Sun Jan 28 10:38:49 2018 -0800&lt;br /&gt;
    x86/entry/64: Remove the SYSCALL64 fast path&lt;/p&gt;

&lt;h2 id=&#34;sysenter-vs-syscall&#34;&gt;sysenter vs syscall&lt;/h2&gt;

&lt;p&gt;&lt;a href=&#34;https://groups.google.com/forum/#!topic/comp.arch/CjDs4MJCBow%5B1-25%5D&#34;&gt;SYSENTER/SYSEXIT vs.SYSCALL/SYSRET&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;http://arkanis.de/weblog/2017-01-05-measurements-of-system-call-performance-and-overhead&#34;&gt;Measurements of system call performance and overhead&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://reverseengineering.stackexchange.com/a/16511/16996&#34;&gt;AMD vs Intel and syscall vs sysenter&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://www.codeguru.com/cpp/misc/misc/system/article.php/c8223/System-Call-Optimization-with-the-SYSENTER-Instruction.htm&#34;&gt;System Call Optimization with the SYSENTER Instruction&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;http://articles.manugarg.com/systemcallinlinux2_6.html&#34;&gt;Sysenter Based System Call Mechanism in Linux 2.6&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&#34;system-call-restart-mechanism-and-orig-eax&#34;&gt;system call restart mechanism and ORIG_EAX&lt;/h2&gt;

&lt;p&gt;&lt;a href=&#34;https://lwn.net/Articles/17744/&#34;&gt;A new system call restart mechanism&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://lkml.org/lkml/2006/8/29/350&#34;&gt;Why set ORIG_EAX(%esp) to -1 in arch/i386/kernel/entry.S:error_code?&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&#34;kernel-implementations&#34;&gt;kernel implementations&lt;/h2&gt;

&lt;p&gt;arch/x86/include/asm/proto.h&lt;br /&gt;
64-bit long mode: syscall; check syscall_init&lt;br /&gt;
64-bit compatible kernel: sysenter, syscall, or int 0x80; check __kernel_vsyscall and def_idts&lt;br /&gt;
32-bit kernel: int 0x80, sysenter;&lt;/p&gt;

&lt;h3 id=&#34;64-bit-without-compat-32-compatible-kernel-support&#34;&gt;64-bit without COMPAT_32/compatible kernel support&lt;/h3&gt;

&lt;p&gt;./int80&lt;br /&gt;
[  730.583700] traps: int80[1697] general protection ip:4000c4 sp:7ffd84b59730 error:402 in int80[400000+1000]&lt;br /&gt;
Segmentation fault (core dumped)&lt;/p&gt;

&lt;h2 id=&#34;x86-64-rcx-and-r10&#34;&gt;x86_64 rcx and r10&lt;/h2&gt;

&lt;p&gt;Check the x86_64 ABI (Linux conventions). According to the &lt;a href=&#34;https://www.felixcloutier.com/x86/syscall&#34;&gt;x86 syscall instruction&lt;/a&gt;, rcx is used to pass the next rip.&lt;br /&gt;
In entry_SYSCALL_64, rcx holds the user rip until it is pushed onto the kernel stack, so rcx cannot carry an argument and r10 carries the 4th argument from userspace.&lt;br /&gt;
According to do_syscall_64: regs-&amp;gt;ax = sys_call_table[nr](regs-&amp;gt;di, regs-&amp;gt;si, regs-&amp;gt;dx, regs-&amp;gt;r10, regs-&amp;gt;r8, regs-&amp;gt;r9);&lt;/p&gt;

&lt;h2 id=&#34;x86-32-asmlinkage&#34;&gt;x86_32 asmlinkage&lt;/h2&gt;

&lt;p&gt;&lt;a href=&#34;https://qr.ae/Ti5MJJ&#34;&gt;By default gcc passes parameters on the stack for x86-32 arch, so what is it needed for? It&amp;rsquo;s because linux kernel uses -mregparm=3 option which overrides the default behaviour&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://lwn.net/Articles/67175/&#34;&gt;-mregparm=3 enabled: Shrinking the kernel with gcc&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://kernelnewbies.org/FAQ/asmlinkage&#34;&gt;What is asmlinkage?&lt;/a&gt;&lt;br /&gt;
However, for C functions invoked from assembly code, the calling convention must be declared explicitly, because the parameter-passing code on the assembly side is fixed. Show all predefined macros for your compiler&lt;/p&gt;

&lt;h2 id=&#34;hacking&#34;&gt;Hacking&lt;/h2&gt;

&lt;p&gt;&lt;a href=&#34;https://www.exploit-db.com/papers/13146&#34;&gt;Obtain sys_call_table on amd64(x86_64)&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&#34;vdso&#34;&gt;vDSO&lt;/h2&gt;

&lt;p&gt;&lt;a href=&#34;http://www.linuxjournal.com/content/creating-vdso-colonels-other-chicken?page=0,0&#34;&gt;Creating a vDSO: the Colonel&amp;rsquo;s Other Chicken&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;http://www.trilithium.com/johan/2005/08/linux-gate/&#34;&gt;What is linux-gate.so.1&lt;/a&gt;&lt;br /&gt;
glibc -&amp;gt; AT_SYSINFO -&amp;gt; __kernel_vsyscall -&amp;gt; sysenter/syscall/int 0x80&lt;br /&gt;
just for vDSO syscalls&lt;br /&gt;
glibc -&amp;gt; AT_SYSINFO_EHDR-&amp;gt; vDSO elf&lt;br /&gt;
&lt;a href=&#34;https://lwn.net/Articles/446528/&#34;&gt;On vsyscalls and the vDSO&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;http://blog.tinola.com/?e=5&#34;&gt;linux syscalls on x86 64&lt;/a&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Softirq of Linux Kernel</title>
      <link>http://firoyang.org/cs/softirq/</link>
      <pubDate>Mon, 03 Apr 2017 13:09:05 CST</pubDate>
      <author>Firo Yang</author>
      <guid>http://firoyang.org/cs/softirq/</guid>
      <description>

&lt;h1 id=&#34;the-old-bottom-half&#34;&gt;The old bottom half&lt;/h1&gt;

&lt;p&gt;ULK 1st: 4.6.6 Bottom Half&lt;br /&gt;
History: commit ad09492558ffa7c67f2b58d23d04dce9ffb9b9dd (tag: 0.99)&lt;br /&gt;
Author: Linus Torvalds &lt;a href=&#34;mailto:torvalds@linuxfoundation.org&#34;&gt;torvalds@linuxfoundation.org&lt;/a&gt;&lt;br /&gt;
Date:   Fri Nov 23 15:09:07 2007 -0500&lt;br /&gt;
    [PATCH] Linux-0.99 (December 13, 1992)&lt;br /&gt;
Firo: There aren&amp;rsquo;t many useful comments, but the code is very simple. Search for bh_base.&lt;/p&gt;

&lt;h1 id=&#34;task-queue&#34;&gt;task queue&lt;/h1&gt;

&lt;p&gt;history: commit 98606bddf430f0a60d21fba93806f4e3c736b170 (tag: 1.1.13)&lt;br /&gt;
Author: Linus Torvalds &lt;a href=&#34;mailto:torvalds@linuxfoundation.org&#34;&gt;torvalds@linuxfoundation.org&lt;/a&gt;&lt;br /&gt;
Date:   Fri Nov 23 15:09:30 2007 -0500&lt;br /&gt;
    Import 1.1.13&lt;br /&gt;
+ * New proposed &amp;ldquo;bottom half&amp;rdquo; handlers:&lt;br /&gt;
+ * (C) 1994 Kai Petzke, wpp@marie.physik.tu-berlin.de&lt;br /&gt;
+ * Advantages:&lt;br /&gt;
+ * - Bottom halfs are implemented as a linked list.  You can have as many&lt;br /&gt;
+ *   of them, as you want.&lt;br /&gt;
+ * - No more scanning of a bit field is required upon call of a bottom half.&lt;br /&gt;
+ * - Support for chained bottom half lists.  The run_task_queue() function can be&lt;br /&gt;
+ *   used as a bottom half handler.  This is for example usefull for bottom&lt;br /&gt;
+ *   halfs, which want to be delayed until the next clock tick.&lt;br /&gt;
+ * Problems:&lt;br /&gt;
+ * - The queue_task_irq() inline function is only atomic with respect to itself.&lt;br /&gt;
+ *   Problems can occur, when queue_task_irq() is called from a normal system&lt;br /&gt;
+ *   call, and an interrupt comes in.  No problems occur, when queue_task_irq()&lt;br /&gt;
+ *   is called from an interrupt or bottom half, and interrupted, as run_task_queue()&lt;br /&gt;
+ *   will not be executed/continued before the last interrupt returns.  If in&lt;br /&gt;
+ *   doubt, use queue_task(), not queue_task_irq().&lt;br /&gt;
+ * - Bottom halfs are called in the reverse order that they were linked into&lt;br /&gt;
+ *   the list.&lt;br /&gt;
+struct tq_struct {&lt;br /&gt;
Check ULK2nd 4.7.3.1 Extending a bottom half for task queues, especially tq_context and keventd&lt;br /&gt;
See The Old Task Queue Mechanism in LKD 3rd edition. Citation from it below.&lt;br /&gt;
&lt;a href=&#34;https://lwn.net/Articles/11351/&#34;&gt;The end of task queues&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;softirq&#34;&gt;Softirq&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;http://www.cs.unca.edu/brock/classes/Spring2013/csci331/notes/paper-1130.pdf&#34;&gt;I’ll Do It Later: Softirqs, Tasklets, Bottom Halves, Task Queues, Work Queues and Timers&lt;/a&gt;&lt;br /&gt;
* Softirqs do not nest on a CPU, but local_bh_disable() may be called recursively.&lt;br /&gt;
Adding SOFTIRQ_OFFSET to current-&amp;gt;preempt_count also disables preemption of the current process.&lt;br /&gt;
* Hard IRQs stay enabled while a softirq runs; a softirq cannot sleep.&lt;br /&gt;
* not per-CPU&lt;/p&gt;

&lt;h1 id=&#34;occassions-of-softirq&#34;&gt;Occasions of Softirq&lt;/h1&gt;

&lt;p&gt;irq_exit()&lt;br /&gt;
Re-enabling softirqs: local_bh_enable()/spin_unlock_bh() explicitly check for and run pending softirqs (network stack, block I/O).&lt;br /&gt;
ksoftirqd&lt;/p&gt;

&lt;h1 id=&#34;tasklet&#34;&gt;Tasklet&lt;/h1&gt;

&lt;p&gt;History: commit 6cc120a8e71a8d124bf6411fc6e730a884b82701 (tag: 2.3.43pre7)&lt;br /&gt;
Author: Linus Torvalds &lt;a href=&#34;mailto:torvalds@linuxfoundation.org&#34;&gt;torvalds@linuxfoundation.org&lt;/a&gt;&lt;br /&gt;
Date:   Fri Nov 23 15:30:52 2007 -0500&lt;br /&gt;
    Import 2.3.43pre7&lt;br /&gt;
+ Tasklets &amp;mdash; multithreaded analogue of BHs.&lt;br /&gt;
+   Main feature differing them of generic softirqs: tasklet&lt;br /&gt;
+   is running only on one CPU simultaneously.&lt;br /&gt;
+   Main feature differing them of BHs: different tasklets&lt;br /&gt;
+   may be run simultaneously on different CPUs.&lt;br /&gt;
+   Properties:&lt;br /&gt;
+   * If tasklet_schedule() is called, then tasklet is guaranteed&lt;br /&gt;
+     to be executed on some cpu at least once after this.&lt;br /&gt;
+   * If the tasklet is already scheduled, but its excecution is still not&lt;br /&gt;
+     started, it will be executed only once.&lt;br /&gt;
+   * If this tasklet is already running on another CPU (or schedule is called&lt;br /&gt;
+     from tasklet itself), it is rescheduled for later.&lt;br /&gt;
+   * Tasklet is strictly serialized wrt itself, but not&lt;br /&gt;
+     wrt another tasklets. If client needs some intertask synchronization,&lt;br /&gt;
+     he makes it with spinlocks.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Softirq of Linux Kernel</title>
      <link>http://firoyang.org/dark_ages/softirq/</link>
      <pubDate>Mon, 03 Apr 2017 13:09:05 CST</pubDate>
      <author>Firo Yang</author>
      <guid>http://firoyang.org/dark_ages/softirq/</guid>
      <description>

&lt;h2 id=&#34;softirq&#34;&gt;softirq&lt;/h2&gt;

&lt;p&gt;The same softirq can run on different CPUs at the same time, so a softirq must be reentrant.&lt;br /&gt;
* Softirqs do not nest on a CPU, but local_bh_disable() may be called recursively.&lt;br /&gt;
Adding SOFTIRQ_OFFSET to current-&amp;gt;preempt_count also disables preemption of the current process.&lt;br /&gt;
* Hard IRQs stay enabled while a softirq runs; a softirq cannot sleep.&lt;br /&gt;
* not per-CPU&lt;/p&gt;

&lt;h2 id=&#34;tasklet-and-kernel-timer-is-based-on-softirq&#34;&gt;tasklet and kernel timer are based on softirq&lt;/h2&gt;

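&lt;p&gt;Adding a new softirq requires recompiling the kernel, so a tasklet is a good alternative.&lt;br /&gt;
* Two tasklets of the same type are never allowed to run at the same time, even on different processors.&lt;br /&gt;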
* First of all, it&amp;rsquo;s a conglomerate of mostly unrelated jobs,&lt;br /&gt;
 which run in the context of a randomly chosen victim&lt;br /&gt;
 w/o the ability to put any control on them. &amp;ndash;Thomas Gleixner&lt;/p&gt;

&lt;p&gt;A tasklet differs from other softirqs in that it runs on a single CPU core at a time.&lt;br /&gt;
spin_lock_bh() is wider than spin_lock(): it also disables bottom halves on the local CPU.&lt;/p&gt;

&lt;h3 id=&#34;time-of-softirq&#34;&gt;time of softirq&lt;/h3&gt;

&lt;p&gt;* after a hardirq, in irq_exit()&lt;br /&gt;
* re-enabling softirqs: local_bh_enable()/spin_unlock_bh() explicitly check for and run pending softirqs (network stack, block I/O).&lt;br /&gt;
* ksoftirqd&lt;/p&gt;

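&lt;h3 id=&#34;tasklet&#34;&gt;tasklet&lt;/h3&gt;

&lt;p&gt;A tasklet is like a workqueue and a softirq is like a kthread. Wonderful, isn&amp;rsquo;t it?&lt;br /&gt;
__tasklet_schedule() queues a tasklet on the per-CPU variable tasklet_vec.tail of one CPU, which guarantees that only one CPU runs it at any given moment.&lt;/p&gt;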

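&lt;h1 id=&#34;faq&#34;&gt;FAQ&lt;/h1&gt;

&lt;h2 id=&#34;when-to-save-irq-rather-than-just-disable-irq&#34;&gt;When to save irq rather than just disable irq&lt;/h2&gt;

&lt;p&gt;local_irq_disable(): use on code paths where interrupts are known to be enabled.&lt;br /&gt;
local_irq_save(flags): use on code paths where interrupts may already be disabled; local_irq_restore(flags) then puts back the previous state.&lt;/p&gt;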

&lt;h2 id=&#34;what-about-irq-nested&#34;&gt;What about nested IRQs?&lt;/h2&gt;

&lt;p&gt;&lt;a href=&#34;http://lwn.net/Articles/380937/&#34;&gt;http://lwn.net/Articles/380937/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&#34;http://thread.gmane.org/gmane.linux.kernel/1152658&#34;&gt;Deal PF_MEMALLOC in softirq&lt;/a&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>x86 interrupt and exception</title>
      <link>http://firoyang.org/cs/event/</link>
      <pubDate>Mon, 03 Apr 2017 13:02:12 CST</pubDate>
      <author>Firo Yang</author>
      <guid>http://firoyang.org/cs/event/</guid>
      <description>

&lt;h1 id=&#34;events&#34;&gt;Events&lt;/h1&gt;

&lt;p&gt;Interrupts: asynchronous (passively received), external&lt;br /&gt;
Exceptions: synchronous (actively detected), internal&lt;br /&gt;
Software interrupts: a kind of trap; raised by int/int3, into, bound.&lt;br /&gt;
IPI&lt;br /&gt;
&lt;a href=&#34;https://www.youtube.com/watch?v=-pehAzaP1eg&#34;&gt;IRQs: the Hard, the Soft, the Threaded and the Preemptible&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://www.youtube.com/watch?v=YE8cRHVIM4E&#34;&gt;How Dealing with Modern Interrupt Architectures can Affect Your Sanity&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;stack-management&#34;&gt;stack management&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://www.kernel.org/doc/html/latest/x86/kernel-stacks.html&#34;&gt;x86_64 IST Stacks in kernel&lt;/a&gt;&lt;br /&gt;
6.14.4 Stack Switching in IA-32e Mode&lt;br /&gt;
irq_stack_union&lt;/p&gt;

&lt;h2 id=&#34;backtrace&#34;&gt;backtrace&lt;/h2&gt;

&lt;p&gt;commit a2bbe75089d5eb9a3a46d50dd5c215e213790288&lt;br /&gt;
x86: Don&amp;rsquo;t use frame pointer to save old stack on irq entry&lt;br /&gt;
       /* Save previous stack value */&lt;br /&gt;
       movq %rsp, %rsi&lt;br /&gt;
&amp;hellip;&lt;br /&gt;
2:     /* Store previous stack value */&lt;br /&gt;
       pushq %rsi&lt;br /&gt;
&lt;a href=&#34;https://lore.kernel.org/patchwork/patch/736894/&#34;&gt;Firo: end of EOI; x86/dumpstack: make stack name tags more comprehensible&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;concurrency-nested&#34;&gt;Concurrency, nested?&lt;/h1&gt;

&lt;h2 id=&#34;mask-exception&#34;&gt;Mask exception&lt;/h2&gt;

&lt;p&gt;RF in EFLAGS for masking #DB&lt;br /&gt;
&lt;a href=&#34;https://stackoverflow.com/a/1581729/1025001&#34;&gt;Does sti/cli affect software interrupt&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&#34;irq-nested&#34;&gt;irq nested?&lt;/h2&gt;

&lt;p&gt;&lt;a href=&#34;http://lwn.net/Articles/380937/&#34;&gt;Prevent nested interrupts when the IRQ stack is near overflowing v2&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;http://www.lenky.info/archives/2013/03/2245&#34;&gt;A fresh look at hard interrupts on the Linux x86-64 architecture&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&#34;firo-clear-the-flags-for-pf-through-interrupt-gate&#34;&gt;Firo: clear the flags for PF through interrupt gate&lt;/h3&gt;

&lt;p&gt;v3a: 6.12.1 Exception- or Interrupt-Handler Procedures&lt;br /&gt;
6.12.1.2 Flag Usage By Exception- or Interrupt-Handler Procedure&lt;/p&gt;

&lt;h2 id=&#34;synchronization&#34;&gt;synchronization&lt;/h2&gt;

&lt;p&gt;local_irq_disable(): use on code paths where interrupts are known to be enabled.&lt;br /&gt;
local_irq_save(flags): use on code paths where interrupts may already be disabled; local_irq_restore(flags) then puts back the previous state.&lt;/p&gt;
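
&lt;p&gt;A minimal user-space sketch of why the two variants above differ. It only models the interrupt-enable flag with a plain variable; model_irqs_enabled and all model_* and helper_* names are hypothetical stand-ins, not kernel APIs. The point it tries to show: a helper that may run with interrupts already disabled has to save and restore the previous state, otherwise it re-enables interrupts behind its caller&amp;rsquo;s back.&lt;/p&gt;

```c
/* User-space model only: 1 means interrupts enabled, 0 means disabled. */
static int model_irqs_enabled = 1;

/* Models local_irq_disable()/local_irq_enable(): the old state is not remembered. */
static void model_irq_disable(void) { model_irqs_enabled = 0; }
static void model_irq_enable(void)  { model_irqs_enabled = 1; }

/* Models local_irq_save(flags): remember the old state, then disable. */
static int model_irq_save(void)
{
    int flags = model_irqs_enabled;
    model_irqs_enabled = 0;
    return flags;
}

/* Models local_irq_restore(flags): put back whatever was there before. */
static void model_irq_restore(int flags) { model_irqs_enabled = flags; }

/* Safe from any context: the caller's interrupt state survives. */
static void helper_with_save(void)
{
    int flags = model_irq_save();
    /* ... critical section ... */
    model_irq_restore(flags);
}

/* Only correct when interrupts are known to be enabled on entry. */
static void helper_with_disable(void)
{
    model_irq_disable();
    /* ... critical section ... */
    model_irq_enable();
}
```

&lt;p&gt;In this model, calling helper_with_disable() while interrupts are already off leaves them on afterwards, which is the bug the save/restore pair avoids.&lt;/p&gt;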

&lt;h2 id=&#34;in-interrupt&#34;&gt;in_interrupt&lt;/h2&gt;

&lt;p&gt;static inline void tick_irq_exit(void)&lt;br /&gt;
{&lt;br /&gt;
#ifdef CONFIG_NO_HZ_COMMON&lt;br /&gt;
        int cpu = smp_processor_id();&lt;br /&gt;
&lt;br /&gt;
        /* Make sure that timer wheel updates are propagated */&lt;br /&gt;
        if ((idle_cpu(cpu) &amp;amp;&amp;amp; !need_resched()) || tick_nohz_full_cpu(cpu)) {&lt;br /&gt;
                if (!in_interrupt())&lt;br /&gt;
                        tick_nohz_irq_exit();&lt;br /&gt;
        }&lt;br /&gt;
#endif&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
/*&lt;br /&gt;
 * Exit an interrupt context. Process softirqs if needed and possible:&lt;br /&gt;
 */&lt;br /&gt;
void irq_exit(void)&lt;br /&gt;
{&lt;br /&gt;
#ifndef __ARCH_IRQ_EXIT_IRQS_DISABLED&lt;br /&gt;
        local_irq_disable();&lt;br /&gt;
#else&lt;br /&gt;
        lockdep_assert_irqs_disabled();&lt;br /&gt;
#endif&lt;br /&gt;
        account_irq_exit_time(current);&lt;br /&gt;
        preempt_count_sub(HARDIRQ_OFFSET);&lt;br /&gt;
        if (!in_interrupt() &amp;amp;&amp;amp; local_softirq_pending())&lt;br /&gt;
                invoke_softirq();&lt;br /&gt;
&lt;br /&gt;
        tick_irq_exit();&lt;/p&gt;

&lt;h1 id=&#34;exceptions&#34;&gt;Exceptions&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;http://wiki.osdev.org/Exceptions&#34;&gt;Exceptions&lt;/a&gt;&lt;br /&gt;
related code:&lt;br /&gt;
do_nmi do_int3 debug_stack_usage_inc, debug_idt_descr, debug_idt_table,&lt;/p&gt;

&lt;h2 id=&#34;faults-a-fault-is-an-exception-that-can-generally-be-corrected-and-that-once-corrected-allows-the-program&#34;&gt;Faults&lt;/h2&gt;

&lt;p&gt;A fault is an exception that can generally be corrected and that, once corrected, allows the program to be restarted with no loss of continuity.&lt;br /&gt;
When a fault is reported, the processor restores the machine state to the state prior to the beginning of execution of the faulting instruction.&lt;br /&gt;
The return address (saved contents of the CS and EIP registers) for the fault handler points to the faulting instruction, rather than to the instruction following the faulting instruction.&lt;/p&gt;

&lt;h2 id=&#34;traps-a-trap-is-an-exception-that-is-reported-immediately-following-the-execution-of-the-trapping-instruction&#34;&gt;Traps&lt;/h2&gt;

&lt;p&gt;A trap is an exception that is reported immediately following the execution of the trapping instruction.&lt;br /&gt;
Traps allow execution of a program or task to be continued without loss of program continuity.&lt;br /&gt;
The return address for the trap handler points to the instruction to be executed after the trapping instruction.&lt;/p&gt;

&lt;h2 id=&#34;aborts-an-abort-is-an-exception-that-does-not-always-report-the-precise-location-of-the-instruction-causing&#34;&gt;Aborts&lt;/h2&gt;

&lt;p&gt;An abort is an exception that does not always report the precise location of the instruction causing the exception and does not allow a restart of the program or task that caused the exception.&lt;br /&gt;
Aborts are used to report severe errors, such as hardware errors and inconsistent or illegal values in system tables.&lt;/p&gt;

&lt;h2 id=&#34;triggering-a-gp-exception&#34;&gt;Triggering a #GP exception&lt;/h2&gt;

&lt;p&gt;exception_GP_trigger.S&lt;/p&gt;

&lt;h2 id=&#34;exeception-init&#34;&gt;Exception init&lt;/h2&gt;

&lt;p&gt;Related code:&lt;br /&gt;
idt_setup_early_traps           #===&amp;gt; idt_table: ist=0; DB, BP&lt;br /&gt;
idt_setup_early_pf              #===&amp;gt; idt_table: PF ist=0;&lt;br /&gt;
trap_init, idt_setup_traps                 #===&amp;gt; idt_table: ist=0; DE, 0x80 &amp;hellip; etc.&lt;br /&gt;
trap_init-&amp;gt;cpu_init, idt_setup_ist_traps             #===&amp;gt; idt_table: ist=1; DB, NMI, BP, DF, MC;&lt;br /&gt;
x86_init.irqs.trap_init         #===&amp;gt; if !KVM, noop&lt;br /&gt;
idt_setup_debugidt_traps        #===&amp;gt; debug_idt_table, check debug stack; INTG; #DB debug; #BP int; check arch/x86/entry/entry_64.S&lt;/p&gt;

&lt;h1 id=&#34;interrupt&#34;&gt;Interrupt&lt;/h1&gt;

&lt;p&gt;If an interrupt occurs in user mode, the CPU may context switch on return for a pending reschedule.&lt;br /&gt;
The Interrupt Descriptor Table (IDT) is a data structure used by the x86 architecture to implement an interrupt vector table.&lt;/p&gt;

&lt;h2 id=&#34;hardware-interrupts&#34;&gt;Hardware interrupts&lt;/h2&gt;

&lt;p&gt;Hardware interrupts are used by devices to signal that they require attention from the operating system.&lt;br /&gt;
More details in init_IRQ(), or set_irq() in drivers.&lt;/p&gt;

&lt;h2 id=&#34;software-interrupt&#34;&gt;software interrupt&lt;/h2&gt;

&lt;p&gt;More details in trap_init().&lt;br /&gt;
A software interrupt is caused by one of the following:&lt;br /&gt;
* exception or trap: an exceptional condition in the processor itself, e.g. divide by zero (panic?)&lt;br /&gt;
* special instruction: an instruction in the instruction set that causes an interrupt when it is executed, for example INT 0x80&lt;/p&gt;

&lt;h2 id=&#34;irq-line-number-vs-interrupt-vector&#34;&gt;IRQ line number vs interrupt vector&lt;/h2&gt;

&lt;p&gt;cat /proc/interrupts&lt;br /&gt;
            CPU0       CPU1       CPU2       CPU3&lt;br /&gt;
   0:         21          0          0          0  IR-IO-APIC    2-edge      timer&lt;br /&gt;
See SDM Vol. 3A Chapter 6 and ULK3 Chapter 4, Interrupt vectors.&lt;br /&gt;
The 0 in /proc/interrupts is an IRQ line number.&lt;br /&gt;
The 0 for Divide error is an interrupt vector.&lt;/p&gt;

&lt;h2 id=&#34;interrupt-init&#34;&gt;Interrupt init&lt;/h2&gt;

&lt;p&gt;early_irq_init = alloc NR_IRQS_LEGACY irq_desc; - 16    #===&amp;gt; [    0.000000] NR_IRQS: 65792, nr_irqs: 1024, preallocated irqs: 16&lt;br /&gt;
init_IRQ()-&amp;gt;x86_init.irqs.intr_init=native_init_IRQ     #===&amp;gt; external interrupt init;&lt;br /&gt;
    pre_vector_init = init_ISA_irqs #===&amp;gt; 1) legacy_pic-&amp;gt;init(0); init 8259a; 2) link irq_desc in irq_desc_tree with flow handle and chip.&lt;br /&gt;
    idt_setup_apic_and_irq_gates    #===&amp;gt; apic normal(from 32) and system interrupts;&lt;/p&gt;

&lt;h2 id=&#34;affinity&#34;&gt;affinity&lt;/h2&gt;

&lt;p&gt;root@snow:/tmp# cat x.sh&lt;br /&gt;
echo 1 &amp;gt; /proc/irq/129/smp_affinity&lt;br /&gt;
sudo trace-cmd record -p function_graph --max-graph-depth 70 -g __irq_set_affinity -c -F  ./x.sh&lt;br /&gt;
__irq_set_affinity msi_domain_set_affinity intel_ir_set_affinity apic_set_affinity&lt;/p&gt;

&lt;p&gt;interrupt balancing&lt;br /&gt;
Interrupts not distributed as specified in smp_affinity: &lt;a href=&#34;https://www.suse.com/support/kb/doc/?id=000018837&#34;&gt;https://www.suse.com/support/kb/doc/?id=000018837&lt;/a&gt;&lt;br /&gt;
De-mystifying interrupt balancing: irqbalance: &lt;a href=&#34;https://www.youtube.com/watch?v=hjMWVrqrt2U&#34;&gt;https://www.youtube.com/watch?v=hjMWVrqrt2U&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;ipi&#34;&gt;IPI&lt;/h1&gt;

&lt;p&gt;commit 52aec3308db85f4e9f5c8b9f5dc4fbd0138c6fa4&lt;br /&gt;
Author: Alex Shi &lt;a href=&#34;mailto:alex.shi@intel.com&#34;&gt;alex.shi@intel.com&lt;/a&gt;&lt;br /&gt;
Date:   Thu Jun 28 09:02:23 2012 +0800&lt;br /&gt;
    x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR&lt;br /&gt;
ERROR_APIC_VECTOR               0xfe&lt;br /&gt;
RESCHEDULE_VECTOR               0xfd&lt;br /&gt;
CALL_FUNCTION_VECTOR            0xfc&lt;br /&gt;
CALL_FUNCTION_SINGLE_VECTOR     0xfb&lt;br /&gt;
THERMAL_APIC_VECTOR             0xfa&lt;br /&gt;
THRESHOLD_APIC_VECTOR           0xf9&lt;br /&gt;
REBOOT_VECTOR                   0xf8&lt;/p&gt;

&lt;h1 id=&#34;history&#34;&gt;History&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://people.cs.clemson.edu/~mark/interrupts.html&#34;&gt;history of interrupts&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://virtualirfan.com/history-of-interrupts&#34;&gt;Another History of interrupts with video&lt;/a&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Scheduling in operating system</title>
      <link>http://firoyang.org/cs/sched_/</link>
      <pubDate>Wed, 29 Mar 2017 10:49:04 CST</pubDate>
      <author>Firo Yang</author>
      <guid>http://firoyang.org/cs/sched_/</guid>
      <description>

&lt;h1 id=&#34;scheduling&#34;&gt;scheduling&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/Scheduling_(computing)&#34;&gt;Scheduling (computing)&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;context-switch&#34;&gt;Context switch&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://www.maizure.org/projects/evolution_x86_context_switch_linux/index.html&#34;&gt;Evolution of the x86 context switch in Linux&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://lwn.net/Articles/520227/&#34;&gt;Al Viro&amp;rsquo;s new execve/kernel_thread design&lt;/a&gt;&lt;br /&gt;
commit 0100301bfdf56a2a370c7157b5ab0fbf9313e1cd&lt;br /&gt;
Author: Brian Gerst &lt;a href=&#34;mailto:brgerst@gmail.com&#34;&gt;brgerst@gmail.com&lt;/a&gt;&lt;br /&gt;
Date:   Sat Aug 13 12:38:19 2016 -0400&lt;br /&gt;
    sched/x86: Rewrite the switch_to() code&lt;br /&gt;
&lt;a href=&#34;https://stackoverflow.com/questions/15019986/why-does-switch-to-use-pushjmpret-to-change-eip-instead-of-jmp-directly/15024312&#34;&gt;Why does switch_to use push+jmp+ret to change EIP, instead of jmp directly?&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;reference&#34;&gt;Reference&lt;/h1&gt;

&lt;p&gt;Process scheduling in Linux &amp;ndash; Volker Seeker from University of Edinburgh&lt;br /&gt;
&lt;a href=&#34;https://tampub.uta.fi/bitstream/handle/10024/96864/GRADU-1428493916.pdf&#34;&gt;A complete guide to Linux process scheduling&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://www.kernel.org/doc/Documentation/scheduler/sched-design-CFS.txt&#34;&gt;https://www.kernel.org/doc/Documentation/scheduler/sched-design-CFS.txt&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://helix979.github.io/jkoo/post/os-scheduler/&#34;&gt;JINKYU KOO&amp;rsquo;s Linux kernel scheduler&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&#34;http://www.joelfernandes.org/linuxinternals/2016/03/20/tif-need-resched-why-is-it-needed.html&#34;&gt;TIF_NEED_RESCHED: why is it needed&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;latency&#34;&gt;Latency&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://lwn.net/Articles/404993/&#34;&gt;Improving scheduler latency&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;general-runqueues&#34;&gt;General runqueues&lt;/h1&gt;

&lt;p&gt;static DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);&lt;br /&gt;
activate_task - move a task to the runqueue.&lt;br /&gt;
wake_up_new_task ttwu_do_activate&lt;/p&gt;

&lt;h2 id=&#34;on-rq&#34;&gt;on_rq&lt;/h2&gt;

&lt;p&gt;commit fd2f4419b4cbe8fe90796df9617c355762afd6a4&lt;br /&gt;
Author: Peter Zijlstra &lt;a href=&#34;mailto:a.p.zijlstra@chello.nl&#34;&gt;a.p.zijlstra@chello.nl&lt;/a&gt;&lt;br /&gt;
Date:   Tue Apr 5 17:23:44 2011 +0200&lt;br /&gt;
    sched: Provide p-&amp;gt;on_rq&lt;/p&gt;

&lt;h1 id=&#34;cfs-core-codes&#34;&gt;CFS core codes&lt;/h1&gt;

&lt;p&gt;git log 20b8a59f2461e&lt;br /&gt;
sched_create_group -&amp;gt; alloc_fair_sched_group -&amp;gt; init_tg_cfs_entry&lt;/p&gt;

&lt;h1 id=&#34;wake-up&#34;&gt;Wake up&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://lkml.org/lkml/2015/4/19/111&#34;&gt;sched: lockless wake-queues&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://www.youtube.com/watch?v=-8c47dHuGIY&#34;&gt;Futex Scaling for Multi-core Systems&lt;/a&gt; &lt;a href=&#34;https://www.slideshare.net/davidlohr/futex-scaling-for-multicore-systems&#34;&gt;Slides&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;program-order-guarantees&#34;&gt;Program-Order guarantees&lt;/h1&gt;

&lt;p&gt;commit 8643cda549ca49a403160892db68504569ac9052&lt;br /&gt;
Author: Peter Zijlstra &lt;a href=&#34;mailto:peterz@infradead.org&#34;&gt;peterz@infradead.org&lt;/a&gt;&lt;br /&gt;
Date:   Tue Nov 17 19:01:11 2015 +0100&lt;br /&gt;
    sched/core, locking: Document Program-Order guarantees&lt;/p&gt;

&lt;h2 id=&#34;lkml-discussions&#34;&gt;LKML discussions&lt;/h2&gt;

&lt;p&gt;&lt;a href=&#34;https://lkml.org/lkml/2015/11/2/311&#34;&gt;scheduler ordering bits&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://lkml.org/lkml/2015/12/3/323&#34;&gt;scheduler ordering bits -v2&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&#34;pi-lock&#34;&gt;pi_lock&lt;/h2&gt;

&lt;p&gt;commit b29739f902ee76a05493fb7d2303490fc75364f4&lt;br /&gt;
Author: Ingo Molnar &lt;a href=&#34;mailto:mingo@elte.hu&#34;&gt;mingo@elte.hu&lt;/a&gt;&lt;br /&gt;
Date:   Tue Jun 27 02:54:51 2006 -0700&lt;br /&gt;
    [PATCH] pi-futex: scheduler support for pi&lt;br /&gt;
    Add framework to boost/unboost the priority of RT tasks.&lt;/p&gt;

&lt;h1 id=&#34;rq-lock-in-schedule-and-context-switch&#34;&gt;rq-&amp;gt;lock in schedule() and context_switch()&lt;/h1&gt;

&lt;p&gt;commit 3a5f5e488ceee9e08df3dff3f01b12fafc9e7e68&lt;br /&gt;
Author: Ingo Molnar &lt;a href=&#34;mailto:mingo@elte.hu&#34;&gt;mingo@elte.hu&lt;/a&gt;&lt;br /&gt;
Date:   Fri Jul 14 00:24:27 2006 -0700&lt;br /&gt;
    [PATCH] lockdep: core, fix rq-lock handling on __ARCH_WANT_UNLOCKED_CTXSW&lt;br /&gt;
+        * Since the runqueue lock will be released by the next&lt;br /&gt;
+        * task&lt;/p&gt;

&lt;h1 id=&#34;running-compensator-records-the-running-process&#34;&gt;Running Compensator records the running process&lt;/h1&gt;

&lt;p&gt;scheduler_tick&lt;br /&gt;
{&lt;br /&gt;
    update_rq_clock&lt;br /&gt;
    task_tick_fair -&amp;gt; entity_tick&lt;br /&gt;
    {&lt;br /&gt;
        update_curr&lt;br /&gt;
        {&lt;br /&gt;
            sum_exec_runtime - total runtime&lt;br /&gt;
            cfs_rq-&amp;gt;exec_clock - cfs_rq runtime&lt;br /&gt;
            vruntime    - advances in inverse proportion to the weight (priority)&lt;br /&gt;
            update_min_vruntime&lt;br /&gt;
            {&lt;br /&gt;
                cfs_rq-&amp;gt;curr, leftmost, min_vruntime, who is min?&lt;br /&gt;
            }&lt;br /&gt;
            cpuacct - cpu sys/user time&lt;br /&gt;
        }&lt;br /&gt;
    }&lt;br /&gt;
}&lt;/p&gt;
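
&lt;p&gt;The vruntime line in update_curr can be sketched as follows. This is a minimal model, not kernel code: the dict-based entity and the function bodies are assumptions made for illustration, and the kernel uses a fixed-point inverse multiplication rather than a plain division. The three weights are entries of the kernel nice-to-weight table.&lt;/p&gt;

```python
# Simplified model of how update_curr() charges runtime to a CFS entity.
# NICE_0_LOAD and the weights come from the kernel nice-to-weight table;
# the dict-based entity and helper bodies are illustrative, not kernel code.

NICE_0_LOAD = 1024

# A few entries of the nice-to-weight table (nice value: load weight).
NICE_TO_WEIGHT = {-5: 3121, 0: 1024, 5: 335}


def calc_delta_fair(delta_exec_ns, weight):
    """Scale real runtime into virtual runtime: heavier tasks age more slowly."""
    return delta_exec_ns * NICE_0_LOAD // weight


def update_curr(entity, delta_exec_ns):
    """Account delta_exec_ns of CPU time to a dict-based entity."""
    entity["sum_exec_runtime"] += delta_exec_ns
    entity["vruntime"] += calc_delta_fair(delta_exec_ns, entity["weight"])
    return entity


if __name__ == "__main__":
    for nice, weight in sorted(NICE_TO_WEIGHT.items()):
        se = {"weight": weight, "sum_exec_runtime": 0, "vruntime": 0}
        update_curr(se, 1_000_000)  # one millisecond of real runtime
        print(nice, se["sum_exec_runtime"], se["vruntime"])
```

&lt;p&gt;A nice 0 task sees vruntime advance at wall-clock rate, a lower-priority task faster and a higher-priority task slower, which is what moves them right or left in the rb-tree.&lt;/p&gt;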

&lt;h1 id=&#34;next-pick-next-task-fair&#34;&gt;Next -&amp;gt; pick_next_task_fair&lt;/h1&gt;

&lt;p&gt;put_prev_entity: update_curr; insert prev back into the rb-tree.&lt;br /&gt;
pick_next_entity: take the leftmost node of the rb-tree.&lt;br /&gt;
set_next_entity: remove next from the tree, because updating it while it is still in the tree would disturb insertions and deletions.&lt;/p&gt;

&lt;h1 id=&#34;unrunnable&#34;&gt;Unrunnable&lt;/h1&gt;

&lt;p&gt;dequeue_task&lt;/p&gt;

&lt;h1 id=&#34;resuming&#34;&gt;Resuming&lt;/h1&gt;

&lt;p&gt;try_to_wake_up-&amp;gt;ttwu_queue-&amp;gt;ttwu_do_activate-&amp;gt; or local wakeup: schedule-&amp;gt;try_to_wake_up_local-&amp;gt;&lt;br /&gt;
{&lt;br /&gt;
    ttwu_activate               #=== special compensation and enqueue on rq&lt;br /&gt;
    {&lt;br /&gt;
        activate_task&lt;br /&gt;
        p-&amp;gt;on_rq = TASK_ON_RQ_QUEUED    #=== 1) rq for task; 2)&lt;br /&gt;
    }&lt;br /&gt;
    ttwu_do_wakeup              #=== normal compensation&lt;br /&gt;
    {&lt;br /&gt;
        check_preempt_curr&lt;br /&gt;
        p-&amp;gt;state = TASK_RUNNING;&lt;br /&gt;
    }&lt;br /&gt;
}&lt;/p&gt;

&lt;p&gt;enqueue_task-&amp;gt; place_entity: compensation for the woken process&lt;/p&gt;

&lt;h2 id=&#34;wake-up-a-sleep-task&#34;&gt;wake up a sleeping task&lt;/h2&gt;

&lt;pre&gt;&lt;code&gt;se-&amp;gt;on_rq &amp;amp; TASK_ON_RQ_QUEUED; deactivate_task set on_rq to 0;
enqueue_task_fair handles group stuff
enqueue_entity deals with the sched_entity: updates the vruntime and load average, accounts the load and NUMA preference.
sysctl_sched_latency: CFS pledges to the pre-existing tasks that they get 6ms to run before a new task runs.
try_to_wake_up_local for local task
try_to_wake_up for any task
&lt;/code&gt;&lt;/pre&gt;

&lt;h1 id=&#34;new-task&#34;&gt;New task&lt;/h1&gt;

&lt;p&gt;special debit compensation: sched_fork-&amp;gt;task_fork_fair-&amp;gt;place_entity - compensation for new process&lt;br /&gt;
normal compensation: wake_up_new_task&lt;br /&gt;
{&lt;br /&gt;
    activate_task               #=== special compensation&lt;br /&gt;
    check_preempt_curr          #=== normal compensation&lt;br /&gt;
}&lt;/p&gt;

&lt;h1 id=&#34;priority&#34;&gt;Priority&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;weight&lt;br /&gt;&lt;/li&gt;
&lt;li&gt;priority&lt;br /&gt;
DEFAULT_PRIO&lt;br /&gt;
fs/proc/array.c&lt;br /&gt;
&lt;br /&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&#34;latency-1&#34;&gt;Latency&lt;/h1&gt;

&lt;ol&gt;
&lt;li&gt;sched_nr_latency= /proc/sys/kernel/sched_latency_ns / /proc/sys/kernel/sched_min_granularity_ns&lt;br /&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;If the number of running processes exceeds sched_nr_latency, the latency target cannot be met; only the minimum granularity is honoured.&lt;/p&gt;

&lt;h2 id=&#34;lqo&#34;&gt;LQO&lt;/h2&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Is the difference between the leftmost and the rightmost smaller than sched_min_granularity_ns?&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;sched_slice&lt;/p&gt;

&lt;h1 id=&#34;energy&#34;&gt;Energy&lt;/h1&gt;

&lt;p&gt;blocked &amp;amp; schedule&lt;br /&gt;
check preempt &amp;amp; schedule&lt;br /&gt;
check_preempt_tick              # new preempts curr&lt;br /&gt;
{&lt;br /&gt;
curr running time &amp;gt; sched_slice     # enough time to yield.&lt;br /&gt;
curr - leftmost &amp;gt; sched_slice       # nice to others.&lt;br /&gt;
}&lt;br /&gt;
check_preempt_wakeup                # the woken task preempts curr&lt;br /&gt;
{&lt;br /&gt;
curr - woken &amp;gt; sysctl_sched_wakeup_granularity;  # past the wakeup-preempt delay&lt;br /&gt;
}&lt;/p&gt;

&lt;h1 id=&#34;io-wait&#34;&gt;io wait&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://lwn.net/Articles/342378/&#34;&gt;https://lwn.net/Articles/342378/&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;load-avg&#34;&gt;Load avg&lt;/h1&gt;

&lt;p&gt;update_load_avg&lt;br /&gt;
&lt;a href=&#34;https://en.wikipedia.org/wiki/Load_(computing)&#34;&gt;https://en.wikipedia.org/wiki/Load_(computing)&lt;/a&gt;&lt;br /&gt;
Check External links&lt;br /&gt;
calc_load_fold_active&lt;br /&gt;
Etymology of avenrun: &lt;a href=&#34;https://elixir.bootlin.com/linux/v4.1/source/arch/s390/appldata/appldata_os.c&#34;&gt;average nr. of running processes during&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&#34;lqo-1&#34;&gt;LQO&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://lwn.net/Articles/136065/&#34;&gt;improve SMP reschedule and idle routines&lt;/a&gt;&lt;br /&gt;
TIF_POLLING_NRFLAG -&amp;gt; Need-Resched-Flag?&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;process migration&lt;br /&gt;
e761b7725234276a802322549cee5255305a0930&lt;br /&gt;
Introduce cpu_active_map and redo sched domain managment&lt;br /&gt;
When to migrate&lt;br /&gt;
    sched_setaffinity __set_cpus_allowed_ptr: manually&lt;br /&gt;
    Selecting a new CPU when waking up a sleeper&lt;br /&gt;
    For balancing, selecting a CPU when waking up a new process in _do_fork&lt;br /&gt;
    execve&amp;rsquo;s sched_exec&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;scheduler clock&lt;br /&gt;
Is rq-&amp;gt;clock in nanoseconds?&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;clock_task and wraps&lt;br /&gt;
fe44d62122829959e960bc699318d58966922a69&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
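
&lt;p&gt;The latency rule and the check_preempt_tick test noted in the list above can be sketched like this. It is a minimal model under stated assumptions: the constants are the unscaled defaults (6 ms latency, 0.75 ms minimum granularity), real kernels scale them by CPU count, and the function bodies are simplified stand-ins rather than the kernel implementation.&lt;/p&gt;

```python
# Illustrative model of the CFS period, slice and tick-preemption check.
# Assumed unscaled defaults: 6 ms latency and 0.75 ms minimum granularity.

SCHED_LATENCY_NS = 6_000_000
SCHED_MIN_GRANULARITY_NS = 750_000
SCHED_NR_LATENCY = SCHED_LATENCY_NS // SCHED_MIN_GRANULARITY_NS


def sched_period(nr_running):
    """Stretch the period once more than SCHED_NR_LATENCY tasks are runnable."""
    if nr_running > SCHED_NR_LATENCY:
        # Latency can no longer be kept; fall back to the minimum granularity.
        return nr_running * SCHED_MIN_GRANULARITY_NS
    return SCHED_LATENCY_NS


def sched_slice(weight, total_weight, nr_running):
    """Share of the period that one entity gets, proportional to its weight."""
    return sched_period(nr_running) * weight // total_weight


def tick_should_preempt(delta_exec_ns, ideal_runtime_ns):
    """First test in check_preempt_tick: curr has run longer than its slice."""
    return delta_exec_ns > ideal_runtime_ns


if __name__ == "__main__":
    for n in (2, 8, 16):
        print(n, sched_period(n), sched_slice(1024, 1024 * n, n))
```

&lt;p&gt;With 16 equal-weight tasks the period grows to 12 ms so that each task still gets the 0.75 ms minimum granularity.&lt;/p&gt;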

&lt;ul&gt;
&lt;li&gt;START_DEBIT&lt;br /&gt;
no standalone commit&lt;br /&gt;
bf0f6f24a1ece8988b243aefe84ee613099a9245&lt;br /&gt;&lt;/li&gt;
&lt;li&gt;why ahead?&lt;br /&gt;
/*&lt;br /&gt;
 * Place new tasks ahead so that they do not starve already running&lt;br /&gt;
 * tasks&lt;br /&gt;
 */&lt;br /&gt;
SCHED_FEAT(START_DEBIT, true)&lt;br /&gt;
the tree is named timeline&lt;br /&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://lwn.net/Articles/404993/&#34;&gt;Improving scheduler latency &lt;/a&gt;&lt;br /&gt;&lt;/li&gt;
&lt;li&gt;skip next last buddy&lt;br /&gt;
&lt;br /&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&#34;load-balancing&#34;&gt;Load balancing&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://www.kernel.org/doc/html/latest/_sources/scheduler/sched-domains.rst.txt&#34;&gt;https://www.kernel.org/doc/html/latest/_sources/scheduler/sched-domains.rst.txt&lt;/a&gt;&lt;br /&gt;
&lt;a href=&#34;https://lwn.net/Articles/80911/&#34;&gt;Scheduling domains&lt;/a&gt;&lt;br /&gt;
sched_init_smp-&amp;gt;&lt;br /&gt;
sched_init_domains or init_sched_domains build_sched_domains&lt;br /&gt;
__visit_domain_allocation_hell()-&amp;gt;__sdt_alloc() allocates the sdd-&amp;gt;sg that is later used to build the groups&lt;br /&gt;
and sg = kzalloc_node(sizeof(struct sched_group) + cpumask_size(); the allocation also covers the size of the cpumask&lt;br /&gt;
/* Build the groups for the domains */&lt;br /&gt;
detach_destroy_domains&lt;br /&gt;
cpu_attach_domain&lt;/p&gt;

&lt;p&gt;CONFIG_SCHED_MC=y&lt;br /&gt;
static noinline struct sched_domain *                                   &lt;br /&gt;
sd_init_##type(struct sched_domain_topology_level *tl, int cpu)&lt;br /&gt;
{&lt;br /&gt;
        struct sched_domain *sd = *per_cpu_ptr(tl-&amp;gt;data.sd, cpu);&lt;br /&gt;
        *sd = SD_##type##_INIT;&lt;br /&gt;
        SD_INIT_NAME(sd, type);                                         &lt;br /&gt;
        sd-&amp;gt;private = &amp;amp;tl-&amp;gt;data;                                        &lt;br /&gt;
        return sd;                                                      &lt;br /&gt;
}&lt;br /&gt;
sched_domain_topology_level default_topology&lt;/p&gt;

&lt;h1 id=&#34;throttling-entities&#34;&gt;Throttling entities&lt;/h1&gt;

&lt;p&gt;commit 85dac906bec3bb41bfaa7ccaa65c4706de5cfdf8&lt;br /&gt;
Author: Paul Turner &lt;a href=&#34;mailto:pjt@google.com&#34;&gt;pjt@google.com&lt;/a&gt;&lt;br /&gt;
Date:   Thu Jul 21 09:43:33 2011 -0700&lt;br /&gt;
    sched: Add support for throttling group entities&lt;br /&gt;
    Now that consumption is tracked (via update_curr()) we add support to throttle&lt;br /&gt;
    group entities (and their corresponding cfs_rqs) in the case where this is no&lt;br /&gt;
    run-time remaining.&lt;/p&gt;

&lt;h1 id=&#34;load-tracking-pelt&#34;&gt;Load tracking - PELT&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;https://lwn.net/Articles/531853/&#34;&gt;Per-entity load tracking&lt;/a&gt;&lt;br /&gt;
commit 5b51f2f80b3b906ce59bd4dce6eca3c7f34cb1b9&lt;br /&gt;
Author: Paul Turner &lt;a href=&#34;mailto:pjt@google.com&#34;&gt;pjt@google.com&lt;/a&gt;&lt;br /&gt;
Date:   Thu Oct 4 13:18:32 2012 +0200&lt;br /&gt;
    sched: Make __update_entity_runnable_avg() fast&lt;br /&gt;
commit a481db34b9beb7a9647c23f2320dd38a2b1d681f&lt;br /&gt;
Refs: v4.11-rc2-229-ga481db34b9be&lt;br /&gt;
Author:     Yuyang Du &lt;a href=&#34;mailto:yuyang.du@intel.com&#34;&gt;yuyang.du@intel.com&lt;/a&gt;&lt;br /&gt;
AuthorDate: Mon Feb 13 05:44:23 2017 +0800&lt;br /&gt;
    sched/fair: Optimize ___update_sched_avg()&lt;/p&gt;
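
&lt;p&gt;The decay that these commits speed up can be sketched in floating point. This is a sketch under assumptions: periods of roughly 1 ms (1024 us) and a decay factor y with y**32 == 0.5, as described in the LWN article linked above. The kernel avoids floating point and relies on precomputed tables, so the helpers below are illustrative names only.&lt;/p&gt;

```python
# Sketch of the geometric decay behind per-entity load tracking (PELT).
# Assumption: a contribution loses half of its weight every 32 periods.

HALFLIFE_PERIODS = 32
Y = 0.5 ** (1.0 / HALFLIFE_PERIODS)


def decay_load(value, periods):
    """Decay a value by y**periods."""
    return value * Y ** periods


def runnable_sum(history_us):
    """Sum per-period runnable time, newest period first, with decay applied."""
    return sum(decay_load(us, age) for age, us in enumerate(history_us))


if __name__ == "__main__":
    print(decay_load(1024.0, 32))       # about half of the original value
    print(runnable_sum([1024] * 64))    # recent periods dominate the sum
```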
</description>
    </item>
    
    <item>
      <title>Miscellaneous notes at Beiyuan No. 5 Courtyard</title>
      <link>http://firoyang.org/life/z2/</link>
      <pubDate>Sat, 26 Nov 2016 20:54:11 CST</pubDate>
      <author>Firo Yang</author>
      <guid>http://firoyang.org/life/z2/</guid>
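      <description>&lt;p&gt;Have my efforts over all these years not been a revolt against the oppression of the state and society?&lt;br /&gt;
More like a modern-day, individualist Moses.&lt;/p&gt;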
</description>
    </item>
    
    <item>
      <title>A man in the wilderness</title>
      <link>http://firoyang.org/philosophy/the_wilds/</link>
      <pubDate>Sun, 12 Jun 2016 20:23:11 CST</pubDate>
      <author>Firo Yang</author>
      <guid>http://firoyang.org/philosophy/the_wilds/</guid>
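      <description>&lt;p&gt;The state is the coldest of all cold monsters. Coldly it lies, and this lie creeps from its mouth: &amp;ldquo;I, the state, am the people.&amp;rdquo;&lt;br /&gt;
                                Nietzsche, &amp;ldquo;The New Idol&amp;rdquo;&lt;/p&gt;

&lt;p&gt;Before a person is even born, his fellow men have already carved up, in the name of the monster called the state, the world he depends on for survival,&lt;br /&gt;
leaving no room for choice. Yes, whether you are willing or not, you will become part of a state. This is also&lt;br /&gt;
the origin of the tragedy of the individual and indeed of the whole human race!&lt;/p&gt;

&lt;p&gt;People inevitably become cogs in society; to survive you have to take part.&lt;br /&gt;
What is the individual living in this so-called society like? The common trait is giving up thinking!&lt;br /&gt;
Faced with a vast social system formed over thousands of years of history, an individual spends&lt;br /&gt;
his whole life understanding and adapting to it. Only a few people keep trying to change and reform the cage&lt;br /&gt;
people live in, such as Rousseau, Montesquieu, Nietzsche, Marx and so on.&lt;/p&gt;

&lt;p&gt;I find that the vast majority of people in a society take their being, and everything around it, for granted.&lt;br /&gt;
Everyone obeys the will of society and becomes its slave. And the vast majority of people in society become the slaves&lt;br /&gt;
of a small minority.&lt;/p&gt;

&lt;p&gt;To be continued.&lt;/p&gt;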
</description>
    </item>
    
    <item>
      <title>Miscellaneous notes at Lanxing Garden</title>
      <link>http://firoyang.org/life/z1/</link>
      <pubDate>Sat, 27 Feb 2016 22:12:11 CST</pubDate>
      <author>Firo Yang</author>
      <guid>http://firoyang.org/life/z1/</guid>
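      <description>&lt;p&gt;I know there are many things I have to do, yet I do not know how to start; that is me right now.&lt;br /&gt;
I am probably like this because I took too many detours before, and now I am afraid of walking down such a road again.&lt;/p&gt;

&lt;p&gt;How do I get started? Should I study algorithms? Should I read the kernel?&lt;br /&gt;
How do I judge whether something is worth the time?&lt;br /&gt;
At the very least I must separate the primary from the secondary.&lt;br /&gt;
First, work out the relationship graph of all the knowledge; that is step one, i.e. settle the so-called framework.&lt;br /&gt;
Then, on that framework, pick what is important and feasible to learn.&lt;br /&gt;
Seen this way it is a lot like learning English. As the saying goes, however much things change, the essence stays the same.&lt;/p&gt;

&lt;p&gt;May 1 2015&lt;br /&gt;
Today I built an Architecture of Computer science knowledge.&lt;br /&gt;
I need to find time to write an article that fully lays out this Architecture. At last I have something to show&lt;br /&gt;
for my computer science. The framework exists, but I still do not know how to fill it in, nor how to tie it&lt;br /&gt;
to actual practice while filling it in.&lt;/p&gt;

&lt;p&gt;In fact, it is enough to follow the logical chain of the architecture and recursively look for sub-architectures, level by level.&lt;br /&gt;
I still cannot properly grasp the connection to actual practice, but at least it is clear that finding sub-architectures&lt;br /&gt;
is on the agenda and needs to be done.&lt;/p&gt;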
</description>
    </item>
    
    <item>
      <title>The common cognition language</title>
      <link>http://firoyang.org/language/ccl/</link>
      <pubDate>Sun, 13 Dec 2015 03:27:16 CST</pubDate>
      <author>Firo Yang</author>
      <guid>http://firoyang.org/language/ccl/</guid>
      <description>&lt;p&gt;What is the organization of knowledge?&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Console and TTY</title>
      <link>http://firoyang.org/dark_ages/console/</link>
      <pubDate>Sat, 05 Dec 2015 14:06:29 CST</pubDate>
      <author>Firo Yang</author>
      <guid>http://firoyang.org/dark_ages/console/</guid>
      <description>

&lt;p&gt;343 line&lt;br /&gt;
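1. /dev/console points to the currently active tty, the same as tty0 (a ttyN, not a pts); the console keeps changing.&lt;br /&gt;
2. /dev/tty always points to the tty of the calling process and does not change.&lt;/p&gt;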

&lt;h1 id=&#34;about-the-design&#34;&gt;About the design&lt;/h1&gt;

&lt;p&gt;Why do we use /dev/xxx to represent the &amp;ldquo;tty&amp;rdquo; device?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The whole point with &amp;ldquo;everything is a file&amp;rdquo; is not that you have some random filename (indeed, sockets and pipes show that &amp;ldquo;file&amp;rdquo; and &amp;ldquo;filename&amp;rdquo;&lt;br /&gt;
have nothing to do with each other), but the fact that you can use common tools to operate on different things. &amp;ndash; Linus&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So we got the key point!&lt;br /&gt;
In order to use the common tools, the file ops and the vfs layer, we &amp;ldquo;abstract&amp;rdquo; the tty device into&lt;br /&gt;
files. Additionally, we must ensure that the files are &lt;em&gt;different&lt;/em&gt;. What&lt;br /&gt;
the word &amp;ldquo;different&amp;rdquo; means is not that you have some random distinct filename, but&lt;br /&gt;
that you can reach the real device through each distinct file.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Wikipedia:&lt;br /&gt;
In mathematics, injections, surjections and bijections are classes of functions distinguished by the manner in which arguments (input expressions from the domain) and images (output expressions from the codomain) are related or mapped to each other.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I got an insight that abstraction is non-injective, right?&lt;br /&gt;
But a non-injective mapping may not be an abstraction.&lt;br /&gt;
An abstraction should come from manipulating different objects.&lt;br /&gt;
Non-injective: many-to-one; multiplex: one-to-many.&lt;br /&gt;
So we can use mathematical language to describe the Linux subsystem.&lt;br /&gt;
From a real-life device to a filesystem file.&lt;br /&gt;
* Abstraction: non-injective, multiplex (not a partial function).&lt;br /&gt;
* Jection num: injective, or non-injective, or multiplex; Jection level: domain set and codomain set!&lt;br /&gt;
Abstraction -&amp;gt; Control abstraction and data abstraction -&amp;gt; Abstraction layer&lt;/p&gt;

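&lt;p&gt;The two main goals of software design.&lt;br /&gt;
1. Simplification: easy to use and easy to operate, e.g. computer architecture, the tty subsystem. Abstraction.&lt;br /&gt;
2. Separation of concerns, modularity (abstraction in layers), the Cartesian idea of division.&lt;/p&gt;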

&lt;h1 id=&#34;tty-struct-disc-data&#34;&gt;tty_struct-&amp;gt;disc_data&lt;/h1&gt;

&lt;p&gt;tty_init_dev-&amp;gt;&lt;br /&gt;
{&lt;br /&gt;
    initialize_tty_struct-&amp;gt;tty_ldisc_init&lt;br /&gt;
    tty_ldisc_setup-&amp;gt;tty_ldisc_open-&amp;gt;n_tty_open-&amp;gt; tty-&amp;gt;disc_data = ldata;&lt;br /&gt;
}&lt;br /&gt;
sys_vhangup-&amp;gt;tty_vhangup_self-&amp;gt;__tty_hangup-&amp;gt;tty_ldisc_hangup-&amp;gt;tty_ldisc_reinit&lt;br /&gt;
vfs_write-&amp;gt;redirected_tty_write-&amp;gt;tty_write-&amp;gt;n_tty_write-&amp;gt;process_output&lt;/p&gt;
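
&lt;p&gt;As a rough idea of what happens at the end of that write chain, here is a sketch of the newline mapping that N_TTY output post-processing performs when OPOST and ONLCR are set. It is deliberately incomplete: the real process_output also handles tabs, column tracking and buffer space, and the flag constants below are made-up labels, not termios bit values.&lt;/p&gt;

```python
# Simplified model of N_TTY output post-processing (OPOST with ONLCR set).
# The flag names are illustrative labels, not the real termios bit masks.

OPOST = "opost"
ONLCR = "onlcr"


def process_output(data, flags):
    """Map NL to CR-NL when both OPOST and ONLCR are enabled."""
    if OPOST in flags and ONLCR in flags:
        return data.replace(b"\n", b"\r\n")
    return data


if __name__ == "__main__":
    print(process_output(b"hello\n", {OPOST, ONLCR}))
    print(process_output(b"hello\n", set()))
```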

&lt;h1 id=&#34;early-con&#34;&gt;early_con&lt;/h1&gt;

&lt;p&gt;EARLYCON_DECLARE(uart8250, early_serial8250_setup); EARLYCON_DECLARE(uart, early_serial8250_setup);&lt;br /&gt;
setup_earlycon-&amp;gt;&lt;br /&gt;
{&lt;br /&gt;
    parse_options-&amp;gt;&lt;br /&gt;
    {&lt;br /&gt;
        parse earlycon_device-&amp;gt;port-&amp;gt;uartclk and&lt;br /&gt;
        earlycon_device-&amp;gt;baud&lt;br /&gt;
    }&lt;br /&gt;
    setup = early_serial8250_setup-&amp;gt; init_port(device);&lt;br /&gt;
    register_console(early_console_dev.con)&lt;br /&gt;
}&lt;/p&gt;

&lt;h1 id=&#34;earlyprintk&#34;&gt;earlyprintk&lt;/h1&gt;

&lt;p&gt;early_param(&#34;earlyprintk&#34;, setup_early_printk)-&amp;gt;&lt;br /&gt;
{&lt;br /&gt;
    early_serial_init-&amp;gt;&lt;br /&gt;
    {&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;}
early_console_register(&amp;amp;early_serial_console, keep);
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;}&lt;/p&gt;

&lt;h1 id=&#34;cpu-hotplug&#34;&gt;cpu hotplug&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;onset&lt;br /&gt;
static struct smp_hotplug_thread softirq_threads = {&lt;br /&gt;
.thread_fn              = run_ksoftirqd,&lt;br /&gt;
};&lt;br /&gt;
early_initcall(spawn_ksoftirqd)-&amp;gt;smpboot_register_percpu_thread(&amp;amp;softirq_threads)&lt;br /&gt;&lt;/li&gt;
&lt;li&gt;nucles onset&lt;br /&gt;
suspend_enter-&amp;gt;enable_nonboot_cpus-&amp;gt;_cpu_up-&amp;gt;smpboot_create_threads-&amp;gt;__smpboot_create_thread(&amp;amp;hotplug_threads)-&amp;gt;&lt;br /&gt;
kthread_create_on_cpu(smpboot_thread_fn-&amp;gt; ht-&amp;gt;thread_fn(td-&amp;gt;cpu)= run_ksoftirqd-&amp;gt;__do_softirq-&amp;gt;h-&amp;gt;action(h) = run_timer_softirq-&amp;gt;&lt;br /&gt;
__run_timers-&amp;gt;call_timer_fn)&lt;br /&gt;
&lt;br /&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The vt is just a tty; it also has a tty_driver, called console_driver.&lt;/p&gt;

&lt;h1 id=&#34;reference&#34;&gt;Reference&lt;/h1&gt;

&lt;h1 id=&#34;contents&#34;&gt;Contents&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;http://superuser.com/questions/144666/what-is-the-difference-between-shell-console-and-terminal&#34;&gt;What is the difference between shell, console, and terminal?&lt;/a&gt;&lt;br /&gt;
What does console do in kernel or u-boot?&lt;br /&gt;
Linux console?&lt;br /&gt;
Computer terminal: keyboard + display&lt;br /&gt;
Terminal emulator:&lt;/p&gt;

&lt;h1 id=&#34;get-a-glance-on-u-boot&#34;&gt;A glance at u-boot&lt;/h1&gt;

&lt;p&gt;start.S:board_init_r&lt;br /&gt;
init_sequence_f: -&amp;gt;init_baud_rate; serial_init; console_init_f&lt;br /&gt;
serial_init -&amp;gt;&amp;amp;eserial1_device-&amp;gt;start=eserial##port##_init-&amp;gt;NS16550_init: UART divisor init.&lt;br /&gt;
First, the serial struct defines some input and output functions.&lt;br /&gt;
These are the basic functions of a serial device: put and get!&lt;br /&gt;
console_init_f: just gd-&amp;gt;have_console = 1;&lt;br /&gt;
init_sequence_r: stdio_init_tables,initr_serial, stdio_add_devices, console_init_r,&lt;br /&gt;
initr_serial: just register &amp;amp;eserial1_device to serial_devices&lt;br /&gt;
stdio_add_devices: drv_system_init, serial_stdio_init&lt;br /&gt;
drv_system_init: register default serial dev to devs.list.&lt;br /&gt;
serial_stdio_init: register &amp;amp;eserial1_device to devs.list. Duplicate, but serial dev &amp;ldquo;eserial0&amp;rdquo;  and system &amp;ldquo;serial&amp;rdquo;.&lt;br /&gt;
console_init_r: console_doenv -&amp;gt;console_setfile:stdio_devices[file(0/1/2)] = dev; actually, dev is &amp;ldquo;serial&amp;rdquo;, but they may be KBD!&lt;br /&gt;
It seems the real job of the console is to choose one of serial and kbd, or possibly several via iomux?&lt;br /&gt;
main_loop:cli_loop: getc!&lt;/p&gt;

&lt;h1 id=&#34;what-is-platform-device-or-driver&#34;&gt;what is platform device or driver?&lt;/h1&gt;

&lt;h1 id=&#34;a-reallife-serial8250&#34;&gt;A real-life serial8250&lt;/h1&gt;

&lt;p&gt;drivers/tty/serial/8250/8250_boca.c:plat_serial8250_port&lt;br /&gt;
module_init-&amp;gt;&lt;br /&gt;
{&lt;br /&gt;
    serial8250_init-&amp;gt;serial8250_isa_init_ports-&amp;gt;serial8250_ports[i].port.ops = &amp;amp;serial8250_pops; //insidious&lt;br /&gt;
    boca_init-&amp;gt;platform_device_register(&amp;amp;boca_device); //register platform device and data.&lt;br /&gt;
}&lt;br /&gt;
* uart_port-&amp;gt;tty_port&lt;/p&gt;

&lt;p&gt;serial8250_probe(plat_serial8250_port)-&amp;gt;serial8250_register_8250_port(uart_8250_port)-&amp;gt;&lt;br /&gt;
uart_add_one_port(&amp;amp;serial8250_reg, &amp;amp;uart-&amp;gt;port=uart_port)-&amp;gt;&lt;br /&gt;
{&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;uart_configure_port-&amp;gt;
{
    port-&amp;gt;ops-&amp;gt;config_port(port, flags)=serial8250_config_port
    if port is a console and not registered yet; register it!
    we know that uport-&amp;gt;cons = drv-&amp;gt;cons; how does that relate to registering up-&amp;gt;cons?
    why do we register it? where does drv-&amp;gt;cons come from?
    // This uart_driver drv is serial8250_reg, so now we know:
    // being a console is an innate capability of the device. Whether it is used depends only on whether you want to use it, i.e. on the config option SERIAL8250_CONSOLE
    // con_driver is the backend: vga, dummy/serial?, fb.
}
tty_port_register_device_attr-&amp;gt;
{
    //tty_driver  tty_port
    tty_port_link_device
    tty_register_device_attr
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;}&lt;/p&gt;

&lt;h1 id=&#34;simple-conceptions&#34;&gt;Simple conceptions&lt;/h1&gt;

&lt;p&gt;&lt;a href=&#34;http://www.linusakesson.net/programming/tty/&#34;&gt;You must read this -&amp;gt; The TTY demystified&lt;/a&gt;&lt;br /&gt;
System consoles are generalized to computer terminals, which are abstracted respectively by virtual consoles and terminal emulators.&lt;br /&gt;
        UART&lt;br /&gt;
        Line discipline + TTY(pts, dummy/serial, kbd+vga/fb)&lt;br /&gt;
* System console&lt;br /&gt;
Virtual terminal, Terminal emulator/telnet/ssh -&amp;gt; pts, Physical terminal&lt;br /&gt;
You need at least one virtual terminal device in order to make use of your keyboard and monitor.&lt;br /&gt;
VT combine keyboard and display see con_init&lt;br /&gt;
con_init init a virtual terminal like gnome-terminal but in kernel.&lt;br /&gt;
con_init mainly init display.&lt;br /&gt;
vty_init mainly init kbd&lt;br /&gt;
They all can be system console.(Exception pts??), if you enable it.&lt;br /&gt;
Console is the entry of linux system.&lt;br /&gt;
* Console driver &amp;ndash; backends of the console&lt;br /&gt;
struct console defines the structure of a console.&lt;br /&gt;
* Console config&lt;br /&gt;
If I disable CONFIG_SERIAL8250_CONSOLE (enable vt console), then there is no boot log and I cannot log in to the system.&lt;br /&gt;
If I disable CONFIG_VT_CONSOLE (enable serial console, /dev/console points to ttyS0, see show_cons_active), there is no boot log but I can log in to the system.&lt;br /&gt;
How to explain this phenomenon?&lt;br /&gt;
From show_cons_active, we know /dev/console should come from console_drivers.&lt;br /&gt;
/dev/console is really the pointer.&lt;br /&gt;
Now, let&amp;rsquo;s inspect open /dev/console.&lt;br /&gt;
* Open /dev/console&lt;br /&gt;
First, this happens at the very end of booting the kernel.&lt;br /&gt;
start_kernel-&amp;gt;rest_init-&amp;gt;kernel_init-&amp;gt;kernel_init_freeable-&amp;gt;sys_open((const char __user *) &#34;/dev/console&#34;, O_RDWR, 0)-&amp;gt;&amp;hellip;-&amp;gt;&lt;br /&gt;
console_fops-&amp;gt;tty_open-&amp;gt;&lt;br /&gt;
{&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Situation: disable VT CONSOLE but enable SERIAL8250_CONSOLE.
// This function work only for /dev/tty
// ls -l /dev/tty
// crw-rw-rw- 1 root tty 5, 0 Dec 10 10:24 /dev/tty
tty_open_current_tty-&amp;gt;
{
    /dev/console is 5:1, just return NULL
}
//This index should be Ctrl + Alt + Fn??
// tty_struct is corresponding virtual console, or just console??
// lookup tty_driver. It looks like looking up an inode, right?
tty_lookup_driver-&amp;gt;
{
    // Find a tty_driver by device() in console_drivers.
    // So we know, got a console, then got a tty_driver, right?
    // Where do the components of console_drivers come from?
    // At present, we should have only serial8250_console because
    // console vt_console_driver is disabled by us! But tty_driver console_driver
    // still exists! So console and tty are really separate!
    // When we init vt, we get the tty_driver console_driver with con_ops type of tty_operations.
    // tty_drivers have the same major as /dev/tty0!
    // char_dev -&amp;gt; tty_driver in vty_init, not disabled.
    // vt console---^
    console_device-&amp;gt;c-&amp;gt;device(c, index)
    // What about serial8250_console?
    // Where is the tty_driver of serial8250_console?
    // console-&amp;gt;data = uart_driver-&amp;gt;tty_driver.
    // We got another scene : uart-&amp;gt;tty
    // serial8250 console-&amp;gt;data-^
    // serial&#39;s tty_driver alloced in serial8250_init with uart_ops.
    // vt&#39;s tty_driver alloced in vty_init.
    // We summarize these:
    // uart_driver serial8250_reg &amp;lt;-&amp;gt;  vc dev or /dev/tty*
    // serial tty driver &amp;lt;-&amp;gt; vt tty driver 
    // fs:vty_init &amp;lt;-&amp;gt; module:serial8250_init
    // tty driver ops con_ops &amp;lt;-&amp;gt; uart_ops
    // vt use major to connect tty and  vc dev
    // serial use major and -&amp;gt; to connect tty and uart_driver
    // It seems that uart and tty has a strong relationship, yet vt.
    // Ok... we got tty driver.
    // If we disable vt console, then here is the serial8250 tty_driver.
}
// Lookup for tty_struct
tty_driver_lookup_tty -&amp;gt;
{
    //tty_struct is alloced in init function alloc_tty_driver.
    // ttys, termios, ports, cdevs.
    // ttys was used by tty_standard_install then tty_driver_install_tty 
    // then tty_init_dev then ok we return to tty open. So this is the start place.

    So we know tty_driver-&amp;gt;ttys[*] must be NULL.
}
tty_init_dev-&amp;gt;
{
    // tty_driver is like a process, ttys are like its files, a tty_struct is like a file!
    // So we know a tty_struct is a tty file.
    // tty_driver is much like an inode
    //So tty_struct-&amp;gt;ops = tty_driver-&amp;gt;ops = &amp;amp; uart_ops
    alloc_tty_struct-&amp;gt;tty-&amp;gt;ops = driver-&amp;gt;ops;
    tty_driver_install_tty(driver, tty_struct)-&amp;gt; tty_standard_install-&amp;gt;driver-&amp;gt;ttys[tty-&amp;gt;index] = tty;
}
// Ok... We got a tty_struct.
// Add /dev/console to tty_struct-&amp;gt;tty_files
tty_add_file
// At present, what have we done?
// open(/dev/console)-&amp;gt;console_drivers-&amp;gt;console-&amp;gt;tty_driver-&amp;gt;tty_struct, right?

tty-&amp;gt;ops-&amp;gt;open(tty, filp)-&amp;gt;//ops = &amp;amp;uart_ops
{
    uart_ops-&amp;gt;open = uart_open-&amp;gt;
    {
        uart-&amp;gt; tty struct
        struct uart_driver *drv = (struct uart_driver *)tty-&amp;gt;driver-&amp;gt;driver_state;
        struct uart_state *state = drv-&amp;gt;state + line; //uart_state
        tty-&amp;gt;driver_data = state;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;}&lt;br /&gt;
// At present, we understand the flow of open /dev/console to serial console&lt;/p&gt;
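
&lt;p&gt;The branch on the device number that the walk-through above keeps returning to can be summarised in a small lookup table. This is only a sketch: the numbers are the ones shown by ls -l (/dev/tty0 is 4:0, /dev/tty is 5:0, /dev/console is 5:1), while the returned strings and the lookup_driver helper are descriptive labels invented for illustration, not kernel symbols.&lt;/p&gt;

```python
import os

# Device numbers as shown by ls -l; the labels are descriptive only.
TTY_MAJOR = 4
TTYAUX_MAJOR = 5

SPECIAL = {
    os.makedev(TTYAUX_MAJOR, 0): "current tty of the calling process",
    os.makedev(TTY_MAJOR, 0): "vt console_driver (foreground virtual console)",
    os.makedev(TTYAUX_MAJOR, 1): "tty_driver of the first console with a device()",
}


def lookup_driver(dev):
    """Return a label for the tty_driver that an open() of dev would reach."""
    return SPECIAL.get(dev, "driver found by major/minor in tty_drivers")


if __name__ == "__main__":
    for major, minor in ((5, 0), (4, 0), (5, 1), (4, 64)):
        print(major, minor, lookup_driver(os.makedev(major, minor)))
```

&lt;p&gt;Only the 5:1 case goes through console_drivers, which is why disabling one console config changes what /dev/console ends up pointing at.&lt;/p&gt;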

&lt;h1 id=&#34;what-about-opening-dev-console-to-vt-console&#34;&gt;What about opening /dev/console to vt console&lt;/h1&gt;

&lt;p&gt;sys_open(/dev/console)-&amp;gt; &amp;hellip; tty_open -&amp;gt;&lt;br /&gt;
{&lt;br /&gt;
    tty_lookup_driver-&amp;gt; get tty_driver=console_driver,&lt;br /&gt;
    Through the name console_driver, we know that the vt tty_driver is the default driver of the console!&lt;br /&gt;
    // Also, we should understand that the so-called system console of the kernel exists as soon as you enable the relevant config options, CONFIG_VT_CONSOLE and CONFIG_SERIAL_8250_CONSOLE.&lt;br /&gt;
    // Then the system console exists and printk has somewhere to go. In other words, you registered, you got printed&lt;br /&gt;
    // In theory, you should be able to input something, like sysrq; I may test that tomorrow.&lt;br /&gt;
    // So what on earth is the sys_open in kernel_init for?&lt;br /&gt;
    // /dev/tty is specific to a process: once a process has opened a tty, it is stored in its signal struct, and /dev/tty simply retrieves it.&lt;/p&gt;

&lt;p&gt;}&lt;/p&gt;

&lt;h1 id=&#34;总结下-打开-dev-console-会从console-drivers-最终到达tty-driver&#34;&gt;In summary, opening /dev/console starts from console_drivers and finally reaches the tty_driver.&lt;/h1&gt;

&lt;h1 id=&#34;这和-dev-tty-dev-ttys-从tty-drivers-差不多&#34;&gt;This is much like /dev/tty* and /dev/ttyS* starting from tty_drivers.&lt;/h1&gt;

&lt;p&gt;// Why does the output not get interleaved this way?&lt;br /&gt;
// How does ctrl alt Fn work?&lt;br /&gt;
// echo xxx /dev/tty in serial tty_lookup_driver&lt;br /&gt;
// Another question: what about the serial terminal?&lt;/p&gt;

&lt;h1 id=&#34;the-perspective&#34;&gt;The perspective&lt;/h1&gt;

&lt;p&gt;/dev/*&lt;br /&gt;
vfs&lt;br /&gt;
chrdev&lt;br /&gt;
tty_fops---------------------------&amp;gt;tty core&lt;br /&gt;
    ld_ops --------------------&amp;gt; tty line discipline (for read, write)&lt;br /&gt;
tty_driver con_ops/uart_ops--------&amp;gt; tty driver and tty_operations&lt;br /&gt;
HW&lt;br /&gt;
There are three different types of tty drivers: console, serial port, and pty.&lt;br /&gt;
serial8250_default_handle_irq&lt;br /&gt;
UART console&lt;br /&gt;
              |---- Virtual terminal ----------------- VT console&lt;br /&gt;
              |                 | -- VT console&lt;br /&gt;
              |----&lt;br /&gt;
        Terminal--|&lt;br /&gt;
              |----&lt;/p&gt;

&lt;h1 id=&#34;what-about-console&#34;&gt;What about console?&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;early_con&lt;br /&gt;
start_kernel or setup_arch(arm)-&amp;gt;parse_early_param-&amp;gt;do_early_param-&amp;gt;p-&amp;gt;setup_func()= setup_early_printk-&amp;gt;register_console&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;vga_con&lt;br /&gt;
start_kernel-&amp;gt;&lt;br /&gt;
{&lt;br /&gt;
// All about vga console&lt;br /&gt;
setup_arch-&amp;gt; conswitchp = &amp;amp;vga_con; or conswitchp = &amp;amp;dummy_con;&lt;br /&gt;
console_init-&amp;gt;&lt;br /&gt;
{&lt;br /&gt;
        tty_ldisc_begin-&amp;gt;tty_register_ldisc(N_TTY, &amp;amp;tty_ldisc_N_TTY);&lt;br /&gt;
    console_initcall(con_init);&lt;br /&gt;
console_initcall(serial8250_console_init)&lt;br /&gt;
    con_init-&amp;gt;&lt;br /&gt;
    {&lt;br /&gt;
        // vc-&amp;gt;vc_sw-&amp;gt;con_putcs is DUMMY&lt;br /&gt;
        // Memory-map the 64KB or 32KB VGA region. Start VGA&lt;br /&gt;
        conswitchp-&amp;gt;con_startup = vgacon_startup -&amp;gt;vga_vram_base = VGA_MAP_MEM(vga_vram_base, vga_vram_size);&lt;br /&gt;
        con_driver_map[0~MAX_NR_CONSOLES] = conswitchp; // trade space for time&lt;br /&gt;
        // The core part!&lt;br /&gt;
        for (currcons = 0; currcons &amp;lt; MIN_NR_CONSOLES; currcons++) {&lt;br /&gt;
            // allocate memory for vc_cons[currcons].d&lt;br /&gt;
            vc_cons[currcons].d = vc = kzalloc(sizeof(struct vc_data), GFP_NOWAIT);&lt;br /&gt;
            INIT_WORK(&amp;amp;vc_cons[currcons].SAK_work, vc_SAK);&lt;br /&gt;
            // initialize vc_cons[currcons].d&lt;br /&gt;
            tty_port_init(&amp;amp;vc-&amp;gt;port);&lt;br /&gt;
            // continue initialization, mainly determining the screenbuf size&lt;br /&gt;
            visual_init(vc, currcons, 1);&lt;br /&gt;
            // allocate memory for vc_screenbuf&lt;br /&gt;
            vc-&amp;gt;vc_screenbuf = kzalloc(vc-&amp;gt;vc_screenbuf_size, GFP_NOWAIT);&lt;br /&gt;
            vc_init(vc, vc-&amp;gt;vc_rows, vc-&amp;gt;vc_cols,&lt;br /&gt;
                currcons || !vc-&amp;gt;vc_sw-&amp;gt;con_save_screen);&lt;br /&gt;
        }&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;    [    0.000000] Console: colour VGA+ 80x25
    register_console(&amp;amp;vt_console_driver);//vt_console can use vgacon writing.
}
serial8250_console_init-&amp;gt;register_console(&amp;amp;serial8250_console) to console_drivers; exclusive_console.
[    0.000000] console [tty0] enabled
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;}&lt;br /&gt;
module_init(serial8250_init);??&lt;br /&gt;
}&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;fbcon&lt;br /&gt;
register_framebuffer-&amp;gt; do_take_over_console -&amp;gt;&lt;br /&gt;
{&lt;br /&gt;
do_register_con_driver-&amp;gt;csw-&amp;gt;con_startup();registered_con_driver&lt;br /&gt;
do_bind_con_driver -&amp;gt;&lt;br /&gt;
{&lt;br /&gt;
    [    3.882220] Console: switching to colour dummy device 80x25&lt;br /&gt;
    [    4.720732] Console: switching to colour frame buffer device 170x48&lt;br /&gt;
}&lt;br /&gt;
}&lt;/p&gt;

&lt;h2 id=&#34;vga-text-console-printk-write&#34;&gt;VGA text console printk &amp;amp; write&lt;/h2&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;kernel space&lt;br /&gt;
printk-&amp;gt; &amp;hellip;-&amp;gt;log_buf&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;userspace for ttyN&lt;br /&gt;
tty_fops-&amp;gt;write=tty_write-&amp;gt; tty_ldisc_N_TTY-&amp;gt;write=n_tty_write-&amp;gt; tty_driver-&amp;gt;ops=con_ops-&amp;gt;write=con_write-&amp;gt;do_con_write&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;agent&lt;br /&gt;
console_drivers-&amp;gt;vt_console_driver-&amp;gt;serial8250_console-&amp;gt;NULL&lt;br /&gt;
console_unlock-&amp;gt;..-&amp;gt;__call_console_drivers-&amp;gt; console_drivers-&amp;gt;write = vt_console_print&lt;br /&gt;
{&lt;br /&gt;
// Save to the screen buf; vga_con does nothing else here.&lt;br /&gt;
scr_writew((vc-&amp;gt;vc_attr &amp;lt;&amp;lt; 8) + c, (unsigned short *)vc-&amp;gt;vc_pos);&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;}&lt;/p&gt;

&lt;h1 id=&#34;what-about-tty&#34;&gt;What about tty&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;onset&lt;br /&gt;
console_init-&amp;gt;&lt;br /&gt;
{&lt;br /&gt;
tty_ldisc_begin-&amp;gt;tty_register_ldisc(N_TTY, &amp;amp;tty_ldisc_N_TTY);&lt;br /&gt;
console_initcall(con_init);&lt;br /&gt;
console_initcall(serial8250_console_init)&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;N_TTY:&lt;a href=&#34;http://www.linux.it/~rubini/docs/serial/serial.html&#34;&gt;Serial Drivers by Alessandro Rubini&lt;/a&gt;&lt;br /&gt;
fs_initcall:chr_dev_init-&amp;gt;drivers/tty/tty_io.c: tty_init-&amp;gt;&lt;br /&gt;
{&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;/* /dev/tty0 = /dev/console console_ops */
/* /dev/tty = the /dev/ttyN where you run echo /dev/tty; tty_ops */

//&amp;quot;/dev/tty&amp;quot;,
cdev_init(&amp;amp;tty_cdev, &amp;amp;tty_fops);
//&amp;quot;/dev/console&amp;quot;
cdev_init(&amp;amp;console_cdev, &amp;amp;console_fops);
vty_init-&amp;gt;
{
    //&amp;quot;/dev/tty0&amp;quot;
    cdev_init(&amp;amp;vc0_cdev, console_fops); 
    //&amp;quot;/dev/ttyN&amp;quot;
    tty_register_driver-&amp;gt;
    {
        // What does tty_register_driver do ?
        // Alloc and register chr dev region.
        // Add cdev with tty_ops and above region.
        // Register tty device
        // Why do we register tty devices?
        // These devices must be used in some place.
        // After registering itself, the driver registers the devices it controls through the tty_register_device function. 
        // So it turns major and minor into a dev_t and stores it in driver-&amp;gt;cdevs[index].dev.
        // In other words, driver-&amp;gt;cdevs[index] is the device the tty driver controls; no wonder it calls cdev_init(&amp;amp;driver-&amp;gt;cdevs[index], &amp;amp;tty_fops);
        // So it seems it would be used at open time; surprisingly it is not, though there is a tty_get_device that uses tty_class
        // tty_register_device_attr-&amp;gt;device_register-&amp;gt;device_add-&amp;gt;klist_add_tail(&amp;amp;dev-&amp;gt;knode_class,&amp;amp;dev-&amp;gt;class-&amp;gt;p-&amp;gt;klist_devices)
        // Indeed, it happens at open: tty-&amp;gt;dev = tty_get_device(tty); in alloc_tty_struct
        // Not sure where this tty-&amp;gt;dev is used; leaving it aside.
        tty_register_device(_attr) -&amp;gt;tty_cdev_add-&amp;gt; cdev_init(&amp;amp;driver-&amp;gt;cdevs[index], &amp;amp;tty_fops);
    }
    kbd_init
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;}&lt;br /&gt;
device_init:serial8250_init-&amp;gt;&lt;br /&gt;
{&lt;br /&gt;
    // In this function we decide &amp;ldquo;/dev/ttyS*&amp;rdquo;&lt;br /&gt;
    // dmesg |grep Serial&lt;br /&gt;
    // [    0.696341] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled&lt;br /&gt;
    // serial8250.c -&amp;gt; tty_io.c&lt;br /&gt;
    serial8250_reg.nr = UART_NR;&lt;br /&gt;
    ret = uart_register_driver(&amp;amp;serial8250_reg);&lt;br /&gt;
    tty_driver set to uart_driver by uart_register_driver -&amp;gt;&lt;br /&gt;
    {&lt;br /&gt;
        drv-&amp;gt;state = kzalloc        //uart_state&lt;br /&gt;
        normal-&amp;gt;driver_state    = drv; //args struct uart_driver *drv = &amp;amp;serial8250_reg&lt;br /&gt;
        tty_set_operations(normal, &amp;amp;uart_ops);&lt;br /&gt;
        struct tty_port *port = &amp;amp;state-&amp;gt;port&lt;br /&gt;
        tty_port_init(port);&lt;br /&gt;
        port-&amp;gt;ops = &amp;amp;uart_port_ops; //tty_port&lt;br /&gt;
        // We register &amp;ldquo;/dev/ttyS*&amp;rdquo; files here.&lt;br /&gt;
        static struct uart_driver serial8250_reg = {&lt;br /&gt;
            .owner                  = THIS_MODULE,&lt;br /&gt;
            .driver_name            = &amp;ldquo;serial&amp;rdquo;,&lt;br /&gt;
            .dev_name               = &amp;ldquo;ttyS&amp;rdquo;,&lt;br /&gt;
            .major                  = TTY_MAJOR,&lt;br /&gt;
            .minor                  = 64,&lt;br /&gt;
            .cons                   = SERIAL8250_CONSOLE,&lt;br /&gt;
        };&lt;br /&gt;
        retval = tty_register_driver(normal); -&amp;gt; register_chrdev_region(dev, driver-&amp;gt;num, driver-&amp;gt;name) //32, ttyS?*? should be tty_ops&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;}
serial8250_register_ports(&amp;amp;serial8250_reg, &amp;amp;serial8250_isa_devs-&amp;gt;dev);
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;}&lt;br /&gt;
* nucleus&lt;br /&gt;
tty_write-&amp;gt;ld-&amp;gt;ops-&amp;gt;write=n_tty_write-&amp;gt;(tty_struct tty-&amp;gt;ops-&amp;gt;write)=uart_write-&amp;gt;&lt;br /&gt;
{&lt;br /&gt;
    struct uart_state *state = tty-&amp;gt;driver_data;&lt;br /&gt;
    port = state-&amp;gt;uart_port;&lt;br /&gt;
    circ = &amp;amp;state-&amp;gt;xmit;&lt;br /&gt;
    memcpy(circ-&amp;gt;buf + circ-&amp;gt;head, buf, c);&lt;br /&gt;
    uart_start-&amp;gt;__uart_start-&amp;gt;(uart_port-&amp;gt;ops-&amp;gt;start_tx(port)); //&amp;amp;uart_port_ops ?? uart_ops??&lt;br /&gt;
}&lt;/p&gt;

&lt;h1 id=&#34;what-about-pseudoterminal&#34;&gt;What about Pseudoterminal&lt;/h1&gt;

&lt;p&gt;/dev/ptmx is the &amp;ldquo;pseudo-terminal master multiplexer&amp;rdquo; (from Wikipedia).&lt;br /&gt;
static struct tty_driver *ptm_driver;&lt;br /&gt;
static struct tty_driver *pts_driver;&lt;br /&gt;
module_init(pty_init)-&amp;gt;unix98_pty_init-&amp;gt;&lt;br /&gt;
{&lt;br /&gt;
    tty_set_operations(ptm_driver, &amp;amp;ptm_unix98_ops);&lt;br /&gt;
    tty_register_driver(ptm_driver)&lt;br /&gt;
    tty_set_operations(pts_driver, &amp;amp;pty_unix98_ops);&lt;br /&gt;
    tty_register_driver(pts_driver)&lt;br /&gt;
    ptmx_fops = tty_fops;&lt;br /&gt;
    ptmx_fops.open = ptmx_open;&lt;br /&gt;
    cdev_init(&amp;amp;ptmx_cdev, &amp;amp;ptmx_fops);&lt;br /&gt;
}&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How to use ptmx?&lt;br /&gt;
&lt;br /&gt;&lt;/li&gt;
&lt;/ul&gt;
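
&lt;p&gt;A minimal userspace sketch of one way to use ptmx, assuming a Linux system with devpts mounted. It relies only on the standard POSIX calls posix_openpt, grantpt, unlockpt and ptsname; the message text and buffer size are illustrative, not taken from the kernel source.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#define _XOPEN_SOURCE 600
#include &amp;lt;fcntl.h&amp;gt;
#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;
#include &amp;lt;unistd.h&amp;gt;

int main(void)
{
    char buf[64];
    ssize_t n;

    /* Opening /dev/ptmx (ptmx_open in the kernel) creates a new master/slave pair. */
    int master = posix_openpt(O_RDWR | O_NOCTTY);
    if (master == -1) {
        perror(&amp;quot;posix_openpt&amp;quot;);
        return 1;
    }

    /* Fix up ownership of the slave node and unlock it before the first open. */
    if (grantpt(master) == -1 || unlockpt(master) == -1) {
        perror(&amp;quot;grantpt/unlockpt&amp;quot;);
        return 1;
    }

    /* Path of the matching slave, a node under /dev/pts. */
    char *slave_name = ptsname(master);
    if (slave_name == NULL) {
        perror(&amp;quot;ptsname&amp;quot;);
        return 1;
    }
    printf(&amp;quot;slave is %s\n&amp;quot;, slave_name);

    int slave = open(slave_name, O_RDWR | O_NOCTTY);
    if (slave == -1) {
        perror(&amp;quot;open slave&amp;quot;);
        return 1;
    }

    /* Data written to the slave side can be read back from the master. */
    if (write(slave, &amp;quot;hello\n&amp;quot;, 6) == -1) {
        perror(&amp;quot;write&amp;quot;);
        return 1;
    }
    n = read(master, buf, sizeof(buf));
    if (n == -1) {
        perror(&amp;quot;read&amp;quot;);
        return 1;
    }
    printf(&amp;quot;master read %zd bytes\n&amp;quot;, n);

    close(slave);
    close(master);
    return 0;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A terminal emulator or sshd does the same on the master side and hands the slave to the shell as its stdin, stdout and stderr.&lt;/p&gt;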

&lt;h1 id=&#34;tty-drivers&#34;&gt;tty drivers&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;cat /proc/tty/drivers&lt;br /&gt;
/dev/tty             /dev/tty        5       0 system:/dev/tty&lt;br /&gt;
/dev/console         /dev/console    5       1 system:console&lt;br /&gt;
/dev/ptmx            /dev/ptmx       5       2 system&lt;br /&gt;
/dev/vc/0            /dev/vc/0       4       0 system:vtmaster&lt;br /&gt;
usbserial            /dev/ttyUSB   188 0-511 serial&lt;br /&gt;
serial               /dev/ttyS       4 64-95 serial&lt;br /&gt;
pty_slave            /dev/pts      136 0-1048575 pty:slave&lt;br /&gt;
pty_master           /dev/ptm      128 0-1048575 pty:master&lt;br /&gt;
unknown              /dev/tty        4 1-63 console&lt;br /&gt;
&lt;br /&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&#34;question&#34;&gt;Question?&lt;/h1&gt;

&lt;p&gt;What is /dev/vcs?&lt;/p&gt;

&lt;h1 id=&#34;backup&#34;&gt;Backup&lt;/h1&gt;

&lt;p&gt;./drivers//tty/vt/vt.c:3042:        register_chrdev_region(MKDEV(TTY_MAJOR, 0), 1, &amp;ldquo;/dev/vc/0&amp;rdquo;) &amp;lt; 0)&lt;br /&gt;
./drivers//tty/vt/vc_screen.c:644:  if (register_chrdev(VCS_MAJOR, &amp;ldquo;vcs&amp;rdquo;, &amp;amp;vcs_fops))&lt;br /&gt;
./drivers//tty/tty_io.c:3377:       error = register_chrdev_region(dev, driver-&amp;gt;num, driver-&amp;gt;name);&lt;br /&gt;
./drivers//tty/tty_io.c:3414:   unregister_chrdev_region(dev, driver-&amp;gt;num);&lt;br /&gt;
./drivers//tty/tty_io.c:3430:   unregister_chrdev_region(MKDEV(driver-&amp;gt;major, driver-&amp;gt;minor_start),&lt;br /&gt;
./drivers//tty/tty_io.c:3607:       register_chrdev_region(MKDEV(TTYAUX_MAJOR, 0), 1, &amp;ldquo;/dev/tty&amp;rdquo;) &amp;lt; 0)&lt;br /&gt;
./drivers//tty/tty_io.c:3613:       register_chrdev_region(MKDEV(TTYAUX_MAJOR, 1), 1, &amp;ldquo;/dev/console&amp;rdquo;) &amp;lt; 0)&lt;br /&gt;
./drivers//tty/pty.c:841:       register_chrdev_region(MKDEV(TTYAUX_MAJOR, 2), 1, &amp;ldquo;/dev/ptmx&amp;rdquo;) &amp;lt; 0)&lt;/p&gt;

&lt;h1 id=&#34;hugh-in-n-tty-write&#34;&gt;hugh in n_tty_write&lt;/h1&gt;

&lt;p&gt;uart_flush_buffer-&amp;gt; tty_wakeup&lt;br /&gt;
serial8250_handle_port-&amp;gt; transmit_chars&lt;br /&gt;
n_tty_read/poll-&amp;gt;input_available_p-&amp;gt;flush_to_ldisc-&amp;gt;n_tty_receive_buf-&amp;gt;uart_flush_chars&lt;br /&gt;
n_tty_write-&amp;gt;uart_flush_chars-&amp;gt;uart_start&lt;br /&gt;
n_tty_write-&amp;gt;uart_write-&amp;gt; uart_start-&amp;gt;start_tx -&amp;gt; serial8250_start_tx -&amp;gt; transmit_chars-&amp;gt;uart_write_wakeup -&amp;gt;uart_tasklet_action-&amp;gt;tty_wakeup&lt;/p&gt;

&lt;h1 id=&#34;echo-char&#34;&gt;Echo char&lt;/h1&gt;

&lt;p&gt;===serial chipset&lt;br /&gt;
serial8250_interrupt&lt;br /&gt;
serial8250_handle_port&lt;br /&gt;
receive_chars&lt;/p&gt;

&lt;p&gt;===serial abstraction&lt;br /&gt;
uart_insert_char&lt;/p&gt;

&lt;p&gt;===terminal device&lt;br /&gt;
tty_insert_flip_char&lt;/p&gt;

&lt;p&gt;receive_chars-&amp;gt;tty_flip_buffer_push -&amp;gt;flush_to_ldisc-&amp;gt;&lt;br /&gt;
=== Line discipline&lt;br /&gt;
disc-&amp;gt;receive_buf=n_tty_receive_buf-&amp;gt;n_tty_receive_char-&amp;gt;echo_char&lt;/p&gt;

&lt;h1 id=&#34;uart-port&#34;&gt;uart_port&lt;/h1&gt;

&lt;p&gt;serial8250_register_ports&lt;br /&gt;
struct uart_8250_port *up = &amp;amp;serial8250_ports[i];&lt;br /&gt;
uart_add_one_port(drv, &amp;amp;up-&amp;gt;port);&lt;/p&gt;
</description>
    </item>
    
  </channel>
</rss>