{"componentChunkName":"component---src-templates-post-template-js","path":"/unix-xv6-008-kernel-main-05-en","result":{"data":{"markdownRemark":{"id":"921de312-c234-55dd-80f5-10a43c4fb749","html":"<blockquote>\n<p>This page has been machine-translated from the <a href=\"/unix-xv6-008-kernel-main-05\">original page</a>.</p>\n</blockquote>\n<p>Inspired by <a href=\"https://amzn.to/3q8TU3K\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">An Introduction to OS Code Reading: Learning Kernel Internals with UNIX V6</a>, I’m reading <a href=\"https://github.com/mit-pdos/xv6-public\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">xv6 OS</a>.</p>\n<p>Because UNIX V6 itself does not run on x86 CPUs, I decided to read the source of <a href=\"https://github.com/kash1064/xv6-public\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">kash1064/xv6-public: xv6 OS</a>, a fork of the <a href=\"https://github.com/mit-pdos/xv6-public\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">xv6 OS</a> repository that makes UNIX V6 run on the x86 architecture.</p>\n<p>In the <a href=\"/unix-xv6-007-kernel-main-04-en\">previous article</a>, I looked at how the <code class=\"language-text\">lapicinit</code> function called from <code class=\"language-text\">main</code> configures the Local APIC.</p>\n<p>This time, I will trace the behavior of the <code class=\"language-text\">seginit</code> function.</p>\n<!-- omit in toc -->\n<h2 id=\"table-of-contents\" style=\"position:relative;\"><a href=\"#table-of-contents\" aria-label=\"table of contents permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 
1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Table of Contents</h2>\n<ul>\n<li>\n<p><a href=\"#the-seginit-function\">The <code class=\"language-text\">seginit</code> function</a></p>\n<ul>\n<li><a href=\"#looking-up-cpus\">Looking up <code class=\"language-text\">cpus</code></a></li>\n<li><a href=\"#setting-the-gdt\">Setting the GDT</a></li>\n</ul>\n</li>\n<li><a href=\"#summary\">Summary</a></li>\n<li><a href=\"#reference-books\">Reference Books</a></li>\n</ul>\n<h2 id=\"the-seginit-function\" style=\"position:relative;\"><a href=\"#the-seginit-function\" aria-label=\"the seginit function permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>The <code class=\"language-text\">seginit</code> function</h2>\n<p>The <code class=\"language-text\">seginit</code> function sets the kernel segment descriptors on the CPU.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token function\">seginit</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>       <span class=\"token comment\">// segment descriptors</span></code></pre></div>\n<p>The <code class=\"language-text\">seginit</code> function is defined in <code class=\"language-text\">vm.c</code> as follows.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token 
comment\">// Set up CPU's kernel segment descriptors.</span>\n<span class=\"token comment\">// Run once on entry on each CPU.</span>\n<span class=\"token keyword\">void</span> <span class=\"token function\">seginit</span><span class=\"token punctuation\">(</span><span class=\"token keyword\">void</span><span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">{</span>\n  <span class=\"token keyword\">struct</span> <span class=\"token class-name\">cpu</span> <span class=\"token operator\">*</span>c<span class=\"token punctuation\">;</span>\n\n  <span class=\"token comment\">// Map \"logical\" addresses to virtual addresses using identity map.</span>\n  <span class=\"token comment\">// Cannot share a CODE descriptor for both kernel and user</span>\n  <span class=\"token comment\">// because it would have to have DPL_USR, but the CPU forbids</span>\n  <span class=\"token comment\">// an interrupt from CPL=0 to DPL=3.</span>\n  c <span class=\"token operator\">=</span> <span class=\"token operator\">&amp;</span>cpus<span class=\"token punctuation\">[</span><span class=\"token function\">cpuid</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">]</span><span class=\"token punctuation\">;</span>\n  c<span class=\"token operator\">-></span>gdt<span class=\"token punctuation\">[</span>SEG_KCODE<span class=\"token punctuation\">]</span> <span class=\"token operator\">=</span> <span class=\"token function\">SEG</span><span class=\"token punctuation\">(</span>STA_X<span class=\"token operator\">|</span>STA_R<span class=\"token punctuation\">,</span> <span class=\"token number\">0</span><span class=\"token punctuation\">,</span> <span class=\"token number\">0xffffffff</span><span class=\"token punctuation\">,</span> <span class=\"token number\">0</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n  c<span class=\"token 
operator\">-></span>gdt<span class=\"token punctuation\">[</span>SEG_KDATA<span class=\"token punctuation\">]</span> <span class=\"token operator\">=</span> <span class=\"token function\">SEG</span><span class=\"token punctuation\">(</span>STA_W<span class=\"token punctuation\">,</span> <span class=\"token number\">0</span><span class=\"token punctuation\">,</span> <span class=\"token number\">0xffffffff</span><span class=\"token punctuation\">,</span> <span class=\"token number\">0</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n  c<span class=\"token operator\">-></span>gdt<span class=\"token punctuation\">[</span>SEG_UCODE<span class=\"token punctuation\">]</span> <span class=\"token operator\">=</span> <span class=\"token function\">SEG</span><span class=\"token punctuation\">(</span>STA_X<span class=\"token operator\">|</span>STA_R<span class=\"token punctuation\">,</span> <span class=\"token number\">0</span><span class=\"token punctuation\">,</span> <span class=\"token number\">0xffffffff</span><span class=\"token punctuation\">,</span> DPL_USER<span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n  c<span class=\"token operator\">-></span>gdt<span class=\"token punctuation\">[</span>SEG_UDATA<span class=\"token punctuation\">]</span> <span class=\"token operator\">=</span> <span class=\"token function\">SEG</span><span class=\"token punctuation\">(</span>STA_W<span class=\"token punctuation\">,</span> <span class=\"token number\">0</span><span class=\"token punctuation\">,</span> <span class=\"token number\">0xffffffff</span><span class=\"token punctuation\">,</span> DPL_USER<span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n  <span class=\"token function\">lgdt</span><span class=\"token punctuation\">(</span>c<span class=\"token operator\">-></span>gdt<span class=\"token punctuation\">,</span> <span class=\"token keyword\">sizeof</span><span 
class=\"token punctuation\">(</span>c<span class=\"token operator\">-></span>gdt<span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token punctuation\">}</span></code></pre></div>\n<p>Let’s read through the source code.</p>\n<p>First, the <code class=\"language-text\">cpu</code> structure was defined in <code class=\"language-text\">proc.h</code> as follows.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token comment\">// Per-CPU state</span>\n<span class=\"token keyword\">struct</span> <span class=\"token class-name\">cpu</span> <span class=\"token punctuation\">{</span>\n  uchar apicid<span class=\"token punctuation\">;</span>                <span class=\"token comment\">// Local APIC ID</span>\n  <span class=\"token keyword\">struct</span> <span class=\"token class-name\">context</span> <span class=\"token operator\">*</span>scheduler<span class=\"token punctuation\">;</span>   <span class=\"token comment\">// swtch() here to enter scheduler</span>\n  <span class=\"token keyword\">struct</span> <span class=\"token class-name\">taskstate</span> ts<span class=\"token punctuation\">;</span>         <span class=\"token comment\">// Used by x86 to find stack for interrupt</span>\n  <span class=\"token keyword\">struct</span> <span class=\"token class-name\">segdesc</span> gdt<span class=\"token punctuation\">[</span>NSEGS<span class=\"token punctuation\">]</span><span class=\"token punctuation\">;</span>   <span class=\"token comment\">// x86 global descriptor table</span>\n  <span class=\"token keyword\">volatile</span> uint started<span class=\"token punctuation\">;</span>       <span class=\"token comment\">// Has the CPU started?</span>\n  <span class=\"token keyword\">int</span> ncli<span class=\"token punctuation\">;</span>                    <span class=\"token comment\">// Depth of pushcli 
nesting.</span>\n  <span class=\"token keyword\">int</span> intena<span class=\"token punctuation\">;</span>                  <span class=\"token comment\">// Were interrupts enabled before pushcli?</span>\n  <span class=\"token keyword\">struct</span> <span class=\"token class-name\">proc</span> <span class=\"token operator\">*</span>proc<span class=\"token punctuation\">;</span>           <span class=\"token comment\">// The process running on this cpu or null</span>\n<span class=\"token punctuation\">}</span><span class=\"token punctuation\">;</span></code></pre></div>\n<p>The <code class=\"language-text\">cpu</code> structure is stored in the array <code class=\"language-text\">cpus</code>.</p>\n<p>As seen in the <a href=\"/unix-xv6-006-kernel-main-03-en\">Multiprocessor Edition</a>, this is an array supporting up to 8 CPUs, as set by <code class=\"language-text\">NCPU</code> defined in <code class=\"language-text\">param.h</code>.</p>\n<h3 id=\"looking-up-cpus\" style=\"position:relative;\"><a href=\"#looking-up-cpus\" aria-label=\"looking up cpus permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Looking up <code class=\"language-text\">cpus</code></h3>\n<p>Let’s look at the section that retrieves the <code class=\"language-text\">cpu</code> structure from the <code class=\"language-text\">cpus</code> array.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token keyword\">struct</span> <span class=\"token 
class-name\">cpu</span> <span class=\"token operator\">*</span>c<span class=\"token punctuation\">;</span>\nc <span class=\"token operator\">=</span> <span class=\"token operator\">&amp;</span>cpus<span class=\"token punctuation\">[</span><span class=\"token function\">cpuid</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">]</span><span class=\"token punctuation\">;</span></code></pre></div>\n<p>The <code class=\"language-text\">cpuid</code> function is defined in <code class=\"language-text\">proc.c</code> as follows.</p>\n<p>It returns the result of subtracting <code class=\"language-text\">cpus</code> from the return value of <code class=\"language-text\">mycpu</code>. Because both are pointers to <code class=\"language-text\">struct cpu</code>, this pointer subtraction yields the index of the current CPU’s entry in the <code class=\"language-text\">cpus</code> array.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token comment\">// Must be called with interrupts disabled</span>\n<span class=\"token keyword\">int</span> <span class=\"token function\">cpuid</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span> <span class=\"token punctuation\">{</span>\n  <span class=\"token keyword\">return</span> <span class=\"token function\">mycpu</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token operator\">-</span>cpus<span class=\"token punctuation\">;</span>\n<span class=\"token punctuation\">}</span>\n\n<span class=\"token comment\">// Must be called with interrupts disabled to avoid the caller being</span>\n<span class=\"token comment\">// rescheduled between reading lapicid and running through the loop.</span>\n<span class=\"token keyword\">struct</span> <span class=\"token class-name\">cpu</span><span class=\"token operator\">*</span> <span class=\"token function\">mycpu</span><span class=\"token punctuation\">(</span><span class=\"token keyword\">void</span><span class=\"token 
punctuation\">)</span>\n<span class=\"token punctuation\">{</span>\n  <span class=\"token keyword\">int</span> apicid<span class=\"token punctuation\">,</span> i<span class=\"token punctuation\">;</span>\n  \n  <span class=\"token keyword\">if</span><span class=\"token punctuation\">(</span><span class=\"token function\">readeflags</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token operator\">&amp;</span>FL_IF<span class=\"token punctuation\">)</span> <span class=\"token function\">panic</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"mycpu called with interrupts enabled\\n\"</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n  \n  apicid <span class=\"token operator\">=</span> <span class=\"token function\">lapicid</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n  <span class=\"token comment\">// APIC IDs are not guaranteed to be contiguous. 
Maybe we should have</span>\n  <span class=\"token comment\">// a reverse map, or reserve a register to store &amp;cpus[i].</span>\n  <span class=\"token keyword\">for</span> <span class=\"token punctuation\">(</span>i <span class=\"token operator\">=</span> <span class=\"token number\">0</span><span class=\"token punctuation\">;</span> i <span class=\"token operator\">&lt;</span> ncpu<span class=\"token punctuation\">;</span> <span class=\"token operator\">++</span>i<span class=\"token punctuation\">)</span> <span class=\"token punctuation\">{</span>\n    <span class=\"token keyword\">if</span> <span class=\"token punctuation\">(</span>cpus<span class=\"token punctuation\">[</span>i<span class=\"token punctuation\">]</span><span class=\"token punctuation\">.</span>apicid <span class=\"token operator\">==</span> apicid<span class=\"token punctuation\">)</span> <span class=\"token keyword\">return</span> <span class=\"token operator\">&amp;</span>cpus<span class=\"token punctuation\">[</span>i<span class=\"token punctuation\">]</span><span class=\"token punctuation\">;</span>\n  <span class=\"token punctuation\">}</span>\n  <span class=\"token function\">panic</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"unknown apicid\\n\"</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token punctuation\">}</span></code></pre></div>\n<p>The key part of <code class=\"language-text\">mycpu</code> is the following code.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\">apicid <span class=\"token operator\">=</span> <span class=\"token function\">lapicid</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token comment\">// APIC IDs are not guaranteed to be contiguous. 
Maybe we should have</span>\n<span class=\"token comment\">// a reverse map, or reserve a register to store &amp;cpus[i].</span>\n<span class=\"token keyword\">for</span> <span class=\"token punctuation\">(</span>i <span class=\"token operator\">=</span> <span class=\"token number\">0</span><span class=\"token punctuation\">;</span> i <span class=\"token operator\">&lt;</span> ncpu<span class=\"token punctuation\">;</span> <span class=\"token operator\">++</span>i<span class=\"token punctuation\">)</span> <span class=\"token punctuation\">{</span>\n  <span class=\"token keyword\">if</span> <span class=\"token punctuation\">(</span>cpus<span class=\"token punctuation\">[</span>i<span class=\"token punctuation\">]</span><span class=\"token punctuation\">.</span>apicid <span class=\"token operator\">==</span> apicid<span class=\"token punctuation\">)</span> <span class=\"token keyword\">return</span> <span class=\"token operator\">&amp;</span>cpus<span class=\"token punctuation\">[</span>i<span class=\"token punctuation\">]</span><span class=\"token punctuation\">;</span>\n<span class=\"token punctuation\">}</span></code></pre></div>\n<p>The <code class=\"language-text\">lapicid</code> function is defined in <code class=\"language-text\">lapic.c</code>; it reads the ID register of the Local APIC and returns the value shifted right by 24 bits, because the APIC ID is held in the upper 8 bits (bits 24 to 31) of that register.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token keyword\">int</span> <span class=\"token function\">lapicid</span><span class=\"token punctuation\">(</span><span class=\"token keyword\">void</span><span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">{</span>\n  <span class=\"token keyword\">if</span> <span class=\"token punctuation\">(</span><span class=\"token operator\">!</span>lapic<span class=\"token punctuation\">)</span> <span class=\"token keyword\">return</span> <span class=\"token number\">0</span><span 
class=\"token punctuation\">;</span>\n  <span class=\"token keyword\">return</span> lapic<span class=\"token punctuation\">[</span>ID<span class=\"token punctuation\">]</span> <span class=\"token operator\">>></span> <span class=\"token number\">24</span><span class=\"token punctuation\">;</span>\n<span class=\"token punctuation\">}</span></code></pre></div>\n<p>The variable <code class=\"language-text\">lapic</code> holds the address of the Local APIC, which was set in <code class=\"language-text\">mp.c</code> as seen in the <a href=\"/unix-xv6-006-kernel-main-03-en\">Multiprocessor Edition</a>.</p>\n<p>According to Intel’s multiprocessor specification (section 5-1), the Local APIC is placed at the base memory address <code class=\"language-text\">0x0FEE00000</code>, and Local APIC IDs are assigned consecutively starting from 0 on the hardware.</p>\n<p>Reference: <a href=\"https://www.manualslib.com/manual/77733/Intel-Multiprocessor.html\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">INTEL MULTIPROCESSOR SPECIFICATION Pdf Download | ManualsLib</a></p>\n<p>When I checked this in the debugger, the value of <code class=\"language-text\">lapic[ID]</code> was 0 on the first call.</p>\n<p>Therefore the return value of <code class=\"language-text\">lapicid</code> is also 0.</p>\n<p>This means <code class=\"language-text\">apicid = lapicid();</code> stores 0 in <code class=\"language-text\">apicid</code>.</p>\n<p>I also confirmed with the debugger that <code class=\"language-text\">cpus[0].apicid</code> is also 0 and matches <code class=\"language-text\">apicid</code>, so the loop below returns <code class=\"language-text\">&amp;cpus[0]</code>.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token keyword\">for</span> <span class=\"token punctuation\">(</span>i <span class=\"token operator\">=</span> <span class=\"token number\">0</span><span 
class=\"token punctuation\">;</span> i <span class=\"token operator\">&lt;</span> ncpu<span class=\"token punctuation\">;</span> <span class=\"token operator\">++</span>i<span class=\"token punctuation\">)</span> <span class=\"token punctuation\">{</span>\n  <span class=\"token keyword\">if</span> <span class=\"token punctuation\">(</span>cpus<span class=\"token punctuation\">[</span>i<span class=\"token punctuation\">]</span><span class=\"token punctuation\">.</span>apicid <span class=\"token operator\">==</span> apicid<span class=\"token punctuation\">)</span> <span class=\"token keyword\">return</span> <span class=\"token operator\">&amp;</span>cpus<span class=\"token punctuation\">[</span>i<span class=\"token punctuation\">]</span><span class=\"token punctuation\">;</span>\n<span class=\"token punctuation\">}</span></code></pre></div>\n<p>Therefore <code class=\"language-text\">mycpu()-cpus</code> also evaluates to 0, and I confirmed that the first <code class=\"language-text\">cpuid</code> call returns 0.</p>\n<p>This means <code class=\"language-text\">c = &amp;cpus[cpuid()];</code> is equivalent to <code class=\"language-text\">c = &amp;cpus[0];</code> on the first run.</p>\n<h3 id=\"setting-the-gdt\" style=\"position:relative;\"><a href=\"#setting-the-gdt\" aria-label=\"setting the gdt permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Setting the GDT</h3>\n<p>So the following lines fill in the entries of the <code class=\"language-text\">gdt</code> array (which has <code class=\"language-text\">NSEGS</code> entries) in the <code class=\"language-text\">cpu</code> structure at <code 
class=\"language-text\">&amp;cpus[0]</code>.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token comment\">// Map \"logical\" addresses to virtual addresses using identity map.</span>\n<span class=\"token comment\">// Cannot share a CODE descriptor for both kernel and user</span>\n<span class=\"token comment\">// because it would have to have DPL_USR, but the CPU forbids</span>\n<span class=\"token comment\">// an interrupt from CPL=0 to DPL=3.</span>\nc <span class=\"token operator\">=</span> <span class=\"token operator\">&amp;</span>cpus<span class=\"token punctuation\">[</span><span class=\"token function\">cpuid</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">]</span><span class=\"token punctuation\">;</span>\nc<span class=\"token operator\">-></span>gdt<span class=\"token punctuation\">[</span>SEG_KCODE<span class=\"token punctuation\">]</span> <span class=\"token operator\">=</span> <span class=\"token function\">SEG</span><span class=\"token punctuation\">(</span>STA_X<span class=\"token operator\">|</span>STA_R<span class=\"token punctuation\">,</span> <span class=\"token number\">0</span><span class=\"token punctuation\">,</span> <span class=\"token number\">0xffffffff</span><span class=\"token punctuation\">,</span> <span class=\"token number\">0</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\nc<span class=\"token operator\">-></span>gdt<span class=\"token punctuation\">[</span>SEG_KDATA<span class=\"token punctuation\">]</span> <span class=\"token operator\">=</span> <span class=\"token function\">SEG</span><span class=\"token punctuation\">(</span>STA_W<span class=\"token punctuation\">,</span> <span class=\"token number\">0</span><span class=\"token punctuation\">,</span> <span class=\"token number\">0xffffffff</span><span class=\"token 
punctuation\">,</span> <span class=\"token number\">0</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\nc<span class=\"token operator\">-></span>gdt<span class=\"token punctuation\">[</span>SEG_UCODE<span class=\"token punctuation\">]</span> <span class=\"token operator\">=</span> <span class=\"token function\">SEG</span><span class=\"token punctuation\">(</span>STA_X<span class=\"token operator\">|</span>STA_R<span class=\"token punctuation\">,</span> <span class=\"token number\">0</span><span class=\"token punctuation\">,</span> <span class=\"token number\">0xffffffff</span><span class=\"token punctuation\">,</span> DPL_USER<span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\nc<span class=\"token operator\">-></span>gdt<span class=\"token punctuation\">[</span>SEG_UDATA<span class=\"token punctuation\">]</span> <span class=\"token operator\">=</span> <span class=\"token function\">SEG</span><span class=\"token punctuation\">(</span>STA_W<span class=\"token punctuation\">,</span> <span class=\"token number\">0</span><span class=\"token punctuation\">,</span> <span class=\"token number\">0xffffffff</span><span class=\"token punctuation\">,</span> DPL_USER<span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token function\">lgdt</span><span class=\"token punctuation\">(</span>c<span class=\"token operator\">-></span>gdt<span class=\"token punctuation\">,</span> <span class=\"token keyword\">sizeof</span><span class=\"token punctuation\">(</span>c<span class=\"token operator\">-></span>gdt<span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span></code></pre></div>\n<p><code class=\"language-text\">&amp;cpus[0]</code> is the <code class=\"language-text\">cpu</code> structure described above, and <code class=\"language-text\">gdt</code> is defined as follows.</p>\n<div 
class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token keyword\">struct</span> <span class=\"token class-name\">segdesc</span> gdt<span class=\"token punctuation\">[</span>NSEGS<span class=\"token punctuation\">]</span><span class=\"token punctuation\">;</span>   <span class=\"token comment\">// x86 global descriptor table</span></code></pre></div>\n<p>The <code class=\"language-text\">segdesc</code> structure is a structure defined in <code class=\"language-text\">mmu.h</code>.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">ifndef</span> <span class=\"token expression\">__ASSEMBLER__</span></span>\n<span class=\"token comment\">// Segment Descriptor</span>\n<span class=\"token keyword\">struct</span> <span class=\"token class-name\">segdesc</span> <span class=\"token punctuation\">{</span>\n  uint lim_15_0 <span class=\"token operator\">:</span> <span class=\"token number\">16</span><span class=\"token punctuation\">;</span>  <span class=\"token comment\">// Low bits of segment limit</span>\n  uint base_15_0 <span class=\"token operator\">:</span> <span class=\"token number\">16</span><span class=\"token punctuation\">;</span> <span class=\"token comment\">// Low bits of segment base address</span>\n  uint base_23_16 <span class=\"token operator\">:</span> <span class=\"token number\">8</span><span class=\"token punctuation\">;</span> <span class=\"token comment\">// Middle bits of segment base address</span>\n  uint type <span class=\"token operator\">:</span> <span class=\"token number\">4</span><span class=\"token punctuation\">;</span>       <span class=\"token comment\">// Segment type (see STS_ constants)</span>\n  uint s <span class=\"token operator\">:</span> <span class=\"token number\">1</span><span 
class=\"token punctuation\">;</span>          <span class=\"token comment\">// 0 = system, 1 = application</span>\n  uint dpl <span class=\"token operator\">:</span> <span class=\"token number\">2</span><span class=\"token punctuation\">;</span>        <span class=\"token comment\">// Descriptor Privilege Level</span>\n  uint p <span class=\"token operator\">:</span> <span class=\"token number\">1</span><span class=\"token punctuation\">;</span>          <span class=\"token comment\">// Present</span>\n  uint lim_19_16 <span class=\"token operator\">:</span> <span class=\"token number\">4</span><span class=\"token punctuation\">;</span>  <span class=\"token comment\">// High bits of segment limit</span>\n  uint avl <span class=\"token operator\">:</span> <span class=\"token number\">1</span><span class=\"token punctuation\">;</span>        <span class=\"token comment\">// Unused (available for software use)</span>\n  uint rsv1 <span class=\"token operator\">:</span> <span class=\"token number\">1</span><span class=\"token punctuation\">;</span>       <span class=\"token comment\">// Reserved</span>\n  uint db <span class=\"token operator\">:</span> <span class=\"token number\">1</span><span class=\"token punctuation\">;</span>         <span class=\"token comment\">// 0 = 16-bit segment, 1 = 32-bit segment</span>\n  uint g <span class=\"token operator\">:</span> <span class=\"token number\">1</span><span class=\"token punctuation\">;</span>          <span class=\"token comment\">// Granularity: limit scaled by 4K when set</span>\n  uint base_31_24 <span class=\"token operator\">:</span> <span class=\"token number\">8</span><span class=\"token punctuation\">;</span> <span class=\"token comment\">// High bits of segment base address</span>\n<span class=\"token punctuation\">}</span><span class=\"token punctuation\">;</span></code></pre></div>\n<p>Incidentally, <code class=\"language-text\">NSEGS</code> is also defined as the constant 6 in <code 
class=\"language-text\">mmu.h</code>.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token comment\">// cpu->gdt[NSEGS] holds the above segments.</span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">NSEGS</span>     <span class=\"token expression\"><span class=\"token number\">6</span></span></span></code></pre></div>\n<p>The <code class=\"language-text\">segdesc</code> structure defined here is a segment descriptor.</p>\n<p><span\n      class=\"gatsby-resp-image-wrapper\"\n      style=\"position: relative; display: block; margin-left: auto; margin-right: auto; max-width: 500px; \"\n    >\n      <a\n    class=\"gatsby-resp-image-link\"\n    href=\"/static/67703d9268cb5e9193a27b143a5996d8/0b533/image-4.png\"\n    style=\"display: block\"\n    target=\"_blank\"\n    rel=\"noopener\"\n  >\n    <span\n    class=\"gatsby-resp-image-background-image\"\n    style=\"padding-bottom: 62.083333333333336%; position: relative; bottom: 0; left: 0; background-image: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAMCAYAAABiDJ37AAAACXBIWXMAARlAAAEZQAGA43XUAAABpklEQVQoz3VSDW+CUAz0//8u3fyYLktchprFOZ18CAo8EFT01it5inFrUovvtdf27rWKokBVVern81ndflfV7f/t7PH7dDppJFbreDyCluc5DoeDXm63W6xWPwiCDYzJsJHo+b7k7JFJHgtZV5al1NT1NJ4pINHzLIPn+YiiLdI0xYfjYLH4xsSZwhcw1/MQ+IE2S+IY67WreUmS4nK5KCAHUkAe7Pd7PbCd4yRGJk2MMVKUXCej13mH66TWroDkg6vxIMtyGAFig1L+M3Iq5pEW60VRamPePwByZV4SeLMJ1QnCCTg97xiDIIDn1qtzVfLN6S9NQP7Q2IkikJf5/AuuFNIpDrklZ5yqaRyAje44tIAmNXCciRSu0W4/YfgyQrfbx2AwvAIulyvEcYLdblcLJQ2bKt8BUoC162pir9cX8CnCMJLiGFY4UmMjzyz/+G/lMAylq4tO5xmj1zdZfaH88f2VRaliUDw++Kb9uTJjFEUyYYDx+B2z2acIsxNOjSpPYILxGXHCJtAdoD6Psn4CVJWR6lFRrm/5Io9Ul5GgLLZ1Tf8Fhqic9oHnDUAAAAAASUVORK5CYII='); background-size: cover; display: block;\"\n  ></span>\n  <picture>\n          <source\n              
srcset=\"/static/67703d9268cb5e9193a27b143a5996d8/8ac56/image-4.webp 240w,\n/static/67703d9268cb5e9193a27b143a5996d8/d3be9/image-4.webp 480w,\n/static/67703d9268cb5e9193a27b143a5996d8/b0a15/image-4.webp 500w\"\n              sizes=\"(max-width: 500px) 100vw, 500px\"\n              type=\"image/webp\"\n            />\n          <source\n            srcset=\"/static/67703d9268cb5e9193a27b143a5996d8/8ff5a/image-4.png 240w,\n/static/67703d9268cb5e9193a27b143a5996d8/e85cb/image-4.png 480w,\n/static/67703d9268cb5e9193a27b143a5996d8/0b533/image-4.png 500w\"\n            sizes=\"(max-width: 500px) 100vw, 500px\"\n            type=\"image/png\"\n          />\n          <img\n            class=\"gatsby-resp-image-image\"\n            src=\"/static/67703d9268cb5e9193a27b143a5996d8/0b533/image-4.png\"\n            alt=\"2022/02/image-4.png\"\n            title=\"2022/02/image-4.png\"\n            loading=\"lazy\"\n            style=\"width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0;\"\n          />\n        </picture>\n  </a>\n    </span></p>\n<p>Reference image: <a href=\"http://flint.cs.yale.edu/cs422/doc/24547212.pdf\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Intel SDM vol3</a></p>\n<p>A segment descriptor is the data structure used for each entry in the GDT and LDT, which I briefly covered in <a href=\"/linux-memory-protect-gdt-ldt\">Notes on x86 CPU memory protection mechanisms (GDT and LDT)</a>.</p>\n<p>A segment descriptor tells the CPU a segment’s base address, size (limit), access rights, and status flags such as the Present bit.</p>\n<p>x86 CPUs use this mechanism as part of memory protection.</p>\n<p>I already covered segment selectors such as <code class=\"language-text\">SEG_KCODE</code> and the privileges assigned to them when reading the bootstrap code, so I will not repeat them in this article.</p>\n<p>Reference: <a href=\"/unix-xv6-001-bootstrap-en\">Reading xv6OS Thoroughly to Fully Understand the Kernel - Bootstrap Edition 
-</a></p>\n<h2 id=\"summary\" style=\"position:relative;\"><a href=\"#summary\" aria-label=\"summary permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Summary</h2>\n<p>In this article, I traced how the <code class=\"language-text\">seginit</code> function initializes the kernel-side segment descriptors.</p>\n<p>Next time, I will look at the <code class=\"language-text\">picinit</code> function.</p>\n<h2 id=\"reference-books\" style=\"position:relative;\"><a href=\"#reference-books\" aria-label=\"reference books permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Reference Books</h2>\n<ul>\n<li><a href=\"https://amzn.to/3qZSCY7\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Build an OS in 30 Days!</a></li>\n<li><a href=\"https://amzn.to/3qXYsZX\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Introduction to OS Development from Zero</a></li>\n<li><a href=\"https://amzn.to/3q8TU3K\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">An Introduction to OS Code Reading: Learning Kernel Internals with UNIX V6</a></li>\n<li><a href=\"https://amzn.to/3I6fkVt\" 
target=\"_blank\" rel=\"nofollow noopener noreferrer\">Detailed Linux Kernel</a></li>\n<li><a href=\"https://amzn.to/3JRUdI2\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Build and Understand an OS: Theory and Implementation for Running x86 Computers</a></li>\n</ul>","fields":{"slug":"/unix-xv6-008-kernel-main-05-en","tagSlugs":["/tag/unix-en/","/tag/xv-6-en/","/tag/kernel-en/","/tag/os-en/","/tag/english/"]},"frontmatter":{"date":"2022-02-03","description":"I am learning about kernels by reading the source code of the educational OS xv6OS. In this article, I walk through the behavior of the xv6OS kernel's main function.","tags":["Unix (en)","xv6 (en)","Kernel (en)","OS (en)","English"],"title":"Reading xv6OS Thoroughly to Fully Understand the Kernel - Segment Descriptor Initialization Edition -","socialImage":{"publicURL":"/static/0490ada38a265ebd2cfb9e8a556d291d/unix-xv6-008-kernel-main-05.png"}}}},"pageContext":{"slug":"/unix-xv6-008-kernel-main-05-en"}},"staticQueryHashes":["251939775","401334301","825871152"]}