{"componentChunkName":"component---src-templates-post-template-js","path":"/unix-xv6-002-load-kernel-en","result":{"data":{"markdownRemark":{"id":"8b7ae0ef-1f71-5373-91bb-7816b3e14bae","html":"<blockquote>\n<p>This page has been machine-translated from the <a href=\"/unix-xv6-002-load-kernel\">original page</a>.</p>\n</blockquote>\n<p>I have been reading <a href=\"https://github.com/mit-pdos/xv6-public\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">xv6 OS</a>, inspired by the book <a href=\"https://amzn.to/3q8TU3K\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">はじめてのOSコードリーディング ~UNIX V6で学ぶカーネルのしくみ</a>.</p>\n<p>I want to get better at reverse engineering and deepen my understanding of kernels and operating systems.</p>\n<p><a href=\"https://amzn.to/3I6fkVt\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">詳解 Linuxカーネル</a> felt a bit heavy, so I was looking for somewhere lighter to start. I came across UNIX V6 — an OS with a total of around 10,000 lines of code, which is just barely comprehensible for a human — and became interested.</p>\n<p>However, UNIX V6 itself does not run on x86 CPUs, so I decided to read the source code of <a href=\"https://github.com/kash1064/xv6-public\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">kash1064/xv6-public: xv6 OS</a>, a fork of <a href=\"https://github.com/mit-pdos/xv6-public\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">xv6 OS</a> — which is UNIX V6 adapted to run on x86 architecture.</p>\n<p>Continuing from <a href=\"/unix-xv6-001-bootstrap-en\">the previous article</a>, I will keep reading the xv6 OS source code.</p>\n<p>In the previous article, I read through the xv6 OS bootstrap code and traced it up to the point just before the kernel body is loaded.</p>\n<p>This time, I will trace what actually happens when the kernel is loaded.</p>\n<!-- omit in toc -->\n<h2 id=\"table-of-contents\" style=\"position:relative;\"><a href=\"#table-of-contents\" aria-label=\"table of contents 
permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Table of Contents</h2>\n<ul>\n<li><a href=\"#loading-the-kernel\">Loading the Kernel</a></li>\n<li>\n<p><a href=\"#building-the-kernel-program\">Building the Kernel Program</a></p>\n<ul>\n<li><a href=\"#linker-script\">Linker Script</a></li>\n<li><a href=\"#linker-script-structure\">Linker Script Structure</a></li>\n<li><a href=\"#defining-the-entry-point\">Defining the Entry Point</a></li>\n<li><a href=\"#sections-defining-the-text-section\">SECTIONS: Defining the text Section</a></li>\n<li><a href=\"#sections-defining-the-rodata-section\">SECTIONS: Defining the rodata Section</a></li>\n<li><a href=\"#sections-defining-the-stab-and-stabstr-sections\">SECTIONS: Defining the stab and stabstr Sections</a></li>\n<li><a href=\"#sections-defining-the-data-section\">SECTIONS: Defining the data Section</a></li>\n<li><a href=\"#sections-defining-the-bss-section\">SECTIONS: Defining the bss Section</a></li>\n<li><a href=\"#sections-discard\">SECTIONS: DISCARD</a></li>\n</ul>\n</li>\n<li>\n<p><a href=\"#kernel-entry-point\">Kernel Entry Point</a></p>\n<ul>\n<li><a href=\"#multiboot-header\">Multiboot Header</a></li>\n<li><a href=\"#defining-the-physical-address-of-the-entry-point\">Defining the Physical Address of the Entry Point</a></li>\n<li><a href=\"#loading-the-kernel-entry-point\">Loading the Kernel Entry Point</a></li>\n<li><a href=\"#what-is-paging\">What Is Paging?</a></li>\n<li><a href=\"#setting-the-stack-pointer\">Setting the Stack 
Pointer</a></li>\n<li><a href=\"#transferring-to-the-main-function\">Transferring to the main Function</a></li>\n</ul>\n</li>\n<li><a href=\"#summary\">Summary</a></li>\n<li><a href=\"#references\">References</a></li>\n</ul>\n<h2 id=\"loading-the-kernel\" style=\"position:relative;\"><a href=\"#loading-the-kernel\" aria-label=\"loading the kernel permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Loading the Kernel</h2>\n<p>Let me first review how the kernel was loaded during the bootstrap phase.</p>\n<p>The first page of the kernel image, which contains the ELF header, was read into scratch space at address <code class=\"language-text\">0x10000</code>, as shown below.</p>\n<p>After that, each program segment was read to the physical address given in its program header (<code class=\"language-text\">paddr</code>), and the entry point recorded in the ELF header was called through the <code class=\"language-text\">entry()</code> function pointer, transferring control to the kernel.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token keyword\">void</span> <span class=\"token function\">bootmain</span><span class=\"token punctuation\">(</span><span class=\"token keyword\">void</span><span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">{</span>\n  <span class=\"token keyword\">struct</span> <span class=\"token class-name\">elfhdr</span> <span class=\"token operator\">*</span>elf<span class=\"token punctuation\">;</span>\n  <span class=\"token keyword\">struct</span> <span class=\"token class-name\">proghdr</span> <span class=\"token operator\">*</span>ph<span class=\"token punctuation\">,</span> <span 
class=\"token operator\">*</span>eph<span class=\"token punctuation\">;</span>\n  <span class=\"token keyword\">void</span> <span class=\"token punctuation\">(</span><span class=\"token operator\">*</span>entry<span class=\"token punctuation\">)</span><span class=\"token punctuation\">(</span><span class=\"token keyword\">void</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n  uchar<span class=\"token operator\">*</span> pa<span class=\"token punctuation\">;</span>\n\n  elf <span class=\"token operator\">=</span> <span class=\"token punctuation\">(</span><span class=\"token keyword\">struct</span> <span class=\"token class-name\">elfhdr</span><span class=\"token operator\">*</span><span class=\"token punctuation\">)</span><span class=\"token number\">0x10000</span><span class=\"token punctuation\">;</span>  <span class=\"token comment\">// scratch space</span>\n\n  <span class=\"token comment\">// Read 1st page off disk</span>\n  <span class=\"token function\">readseg</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">(</span>uchar<span class=\"token operator\">*</span><span class=\"token punctuation\">)</span>elf<span class=\"token punctuation\">,</span> <span class=\"token number\">4096</span><span class=\"token punctuation\">,</span> <span class=\"token number\">0</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n\n  <span class=\"token comment\">// Is this an ELF executable?</span>\n  <span class=\"token keyword\">if</span><span class=\"token punctuation\">(</span>elf<span class=\"token operator\">-></span>magic <span class=\"token operator\">!=</span> ELF_MAGIC<span class=\"token punctuation\">)</span>\n    <span class=\"token keyword\">return</span><span class=\"token punctuation\">;</span>  <span class=\"token comment\">// let bootasm.S handle error</span>\n\n  <span class=\"token comment\">// Load each program segment (ignores ph 
flags).</span>\n  ph <span class=\"token operator\">=</span> <span class=\"token punctuation\">(</span><span class=\"token keyword\">struct</span> <span class=\"token class-name\">proghdr</span><span class=\"token operator\">*</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">(</span>uchar<span class=\"token operator\">*</span><span class=\"token punctuation\">)</span>elf <span class=\"token operator\">+</span> elf<span class=\"token operator\">-></span>phoff<span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n  eph <span class=\"token operator\">=</span> ph <span class=\"token operator\">+</span> elf<span class=\"token operator\">-></span>phnum<span class=\"token punctuation\">;</span>\n  <span class=\"token keyword\">for</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">;</span> ph <span class=\"token operator\">&lt;</span> eph<span class=\"token punctuation\">;</span> ph<span class=\"token operator\">++</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">{</span>\n    pa <span class=\"token operator\">=</span> <span class=\"token punctuation\">(</span>uchar<span class=\"token operator\">*</span><span class=\"token punctuation\">)</span>ph<span class=\"token operator\">-></span>paddr<span class=\"token punctuation\">;</span>\n    <span class=\"token function\">readseg</span><span class=\"token punctuation\">(</span>pa<span class=\"token punctuation\">,</span> ph<span class=\"token operator\">-></span>filesz<span class=\"token punctuation\">,</span> ph<span class=\"token operator\">-></span>off<span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n    <span class=\"token keyword\">if</span><span class=\"token punctuation\">(</span>ph<span class=\"token operator\">-></span>memsz <span class=\"token operator\">></span> ph<span class=\"token 
operator\">-></span>filesz<span class=\"token punctuation\">)</span>\n      <span class=\"token function\">stosb</span><span class=\"token punctuation\">(</span>pa <span class=\"token operator\">+</span> ph<span class=\"token operator\">-></span>filesz<span class=\"token punctuation\">,</span> <span class=\"token number\">0</span><span class=\"token punctuation\">,</span> ph<span class=\"token operator\">-></span>memsz <span class=\"token operator\">-</span> ph<span class=\"token operator\">-></span>filesz<span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n  <span class=\"token punctuation\">}</span>\n\n  <span class=\"token comment\">// Call the entry point from the ELF header.</span>\n  <span class=\"token comment\">// Does not return!</span>\n  entry <span class=\"token operator\">=</span> <span class=\"token punctuation\">(</span><span class=\"token keyword\">void</span><span class=\"token punctuation\">(</span><span class=\"token operator\">*</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">(</span><span class=\"token keyword\">void</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">(</span>elf<span class=\"token operator\">-></span>entry<span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n  <span class=\"token function\">entry</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token punctuation\">}</span></code></pre></div>\n<p>So this time I want to start by tracking down the <code class=\"language-text\">entry()</code> function.</p>\n<h2 id=\"building-the-kernel-program\" style=\"position:relative;\"><a href=\"#building-the-kernel-program\" aria-label=\"building the kernel program permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" 
viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Building the Kernel Program</h2>\n<p>Let me trace through how the kernel program is built.</p>\n<p>The final image file <code class=\"language-text\">xv6.img</code> is generated with the following commands.</p>\n<p><code class=\"language-text\">xv6.img</code> is produced by first creating a zero-filled image of 10,000 blocks (5,120,000 bytes at the default 512-byte block size of <code class=\"language-text\">dd</code>), then writing <code class=\"language-text\">bootblock</code> into the first block and <code class=\"language-text\">kernel</code> from the second block onward (<code class=\"language-text\">seek=1</code>).</p>\n<div class=\"gatsby-highlight\" data-language=\"bash\"><pre class=\"language-bash\"><code class=\"language-bash\">xv6.img: bootblock kernel\n<span class=\"token function\">dd</span> <span class=\"token assign-left variable\">if</span><span class=\"token operator\">=</span>/dev/zero <span class=\"token assign-left variable\">of</span><span class=\"token operator\">=</span>xv6.img <span class=\"token assign-left variable\">count</span><span class=\"token operator\">=</span><span class=\"token number\">10000</span>\n<span class=\"token function\">dd</span> <span class=\"token assign-left variable\">if</span><span class=\"token operator\">=</span>bootblock <span class=\"token assign-left variable\">of</span><span class=\"token operator\">=</span>xv6.img <span class=\"token assign-left variable\">conv</span><span class=\"token operator\">=</span>notrunc\n<span class=\"token function\">dd</span> <span class=\"token assign-left variable\">if</span><span class=\"token operator\">=</span>kernel <span class=\"token assign-left variable\">of</span><span class=\"token operator\">=</span>xv6.img <span class=\"token 
assign-left variable\">seek</span><span class=\"token operator\">=</span><span class=\"token number\">1</span> <span class=\"token assign-left variable\">conv</span><span class=\"token operator\">=</span>notrunc</code></pre></div>\n<p>We already traced <code class=\"language-text\">bootblock</code> in the previous article, so this time let’s focus on <code class=\"language-text\">kernel</code>.</p>\n<div class=\"gatsby-highlight\" data-language=\"bash\"><pre class=\"language-bash\"><code class=\"language-bash\">kernel: <span class=\"token variable\"><span class=\"token variable\">$(</span>OBJS<span class=\"token variable\">)</span></span> entry.o entryother initcode kernel.ld\n<span class=\"token variable\"><span class=\"token variable\">$(</span>LD<span class=\"token variable\">)</span></span> <span class=\"token variable\"><span class=\"token variable\">$(</span>LDFLAGS<span class=\"token variable\">)</span></span> -T kernel.ld -o kernel entry.o <span class=\"token variable\"><span class=\"token variable\">$(</span>OBJS<span class=\"token variable\">)</span></span> -b binary initcode entryother\n<span class=\"token variable\"><span class=\"token variable\">$(</span>OBJDUMP<span class=\"token variable\">)</span></span> -S kernel <span class=\"token operator\">></span> kernel.asm\n<span class=\"token variable\"><span class=\"token variable\">$(</span>OBJDUMP<span class=\"token variable\">)</span></span> -t kernel <span class=\"token operator\">|</span> <span class=\"token function\">sed</span> <span class=\"token string\">'1,/SYMBOL TABLE/d; s/ .* / /; /^$$/d'</span> <span class=\"token operator\">></span> kernel.sym</code></pre></div>\n<p>The dependencies for <code class=\"language-text\">kernel</code> are <code class=\"language-text\">$(OBJS) entry.o entryother initcode kernel.ld</code>.</p>\n<p>The list of <code class=\"language-text\">$(OBJS)</code> is quite long, so I will skip it. 
It consists of the object files that make up the kernel, such as <code class=\"language-text\">main.o</code>.</p>\n<p>The last two lines only output the binary’s disassembly and symbol information; the actual binary is built by the line <code class=\"language-text\">$(LD) $(LDFLAGS) -T kernel.ld -o kernel entry.o $(OBJS) -b binary initcode entryother</code>.</p>\n<p><code class=\"language-text\">LD</code> is defined as <code class=\"language-text\">$(TOOLPREFIX)ld</code>, just like the <code class=\"language-text\">GCC</code> described in the previous article.</p>\n<p>Since we are not cross-compiling this time, the plain <code class=\"language-text\">ld</code> command is used.</p>\n<div class=\"gatsby-highlight\" data-language=\"bash\"><pre class=\"language-bash\"><code class=\"language-bash\">LD <span class=\"token operator\">=</span> <span class=\"token variable\"><span class=\"token variable\">$(</span>TOOLPREFIX<span class=\"token variable\">)</span></span>ld\n\n<span class=\"token comment\"># FreeBSD ld wants ``elf_i386_fbsd''</span>\nLDFLAGS <span class=\"token operator\">+=</span> -m <span class=\"token variable\"><span class=\"token variable\">$(</span>shell <span class=\"token punctuation\">$(</span>LD<span class=\"token punctuation\">)</span> -V <span class=\"token operator\">|</span> <span class=\"token function\">grep</span> elf_i386 <span class=\"token operator\"><span class=\"token file-descriptor important\">2</span>></span>/dev/null <span class=\"token operator\">|</span> <span class=\"token function\">head</span> -n <span class=\"token number\">1</span><span class=\"token variable\">)</span></span></code></pre></div>\n<p><code class=\"language-text\">LDFLAGS</code> extracts <code class=\"language-text\">elf_i386</code> from the output of <code class=\"language-text\">ld -V</code> and passes it as the <code class=\"language-text\">-m elf_i386</code> option.</p>\n<p><code class=\"language-text\">ld -V</code> prints the version of <code 
class=\"language-text\">ld</code> together with the list of supported emulations.</p>\n<p>The <code class=\"language-text\">-T</code> option reads link commands from the specified linker script (<code class=\"language-text\">kernel.ld</code>), which replaces the default linker script.</p>\n<p>The <code class=\"language-text\">-b</code> option specifies the format of the input files listed after it; here <code class=\"language-text\">binary</code> is specified, so those files are linked in as raw binary data.</p>\n<p>The <code class=\"language-text\">initcode</code> and <code class=\"language-text\">entryother</code> that follow are binaries assembled from assembly files.</p>\n<p>The actual command executed at build time looks like this:</p>\n<div class=\"gatsby-highlight\" data-language=\"bash\"><pre class=\"language-bash\"><code class=\"language-bash\">ld -m elf_i386 -T kernel.ld -o kernel <span class=\"token punctuation\">\\</span>\nentry.o bio.o console.o exec.o file.o fs.o ide.o ioapic.o kalloc.o kbd.o lapic.o log.o main.o mp.o picirq.o pipe.o proc.o sleeplock.o spinlock.o string.o swtch.o syscall.o sysfile.o sysproc.o trapasm.o trap.o uart.o vectors.o vm.o  <span class=\"token punctuation\">\\</span>\n-b binary initcode entryother</code></pre></div>\n<p>Reference: <a href=\"https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_3.html\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">LD, the GNU linker - Options</a></p>\n<p>Next, let’s look inside the linker script <code class=\"language-text\">kernel.ld</code>.</p>\n<h3 id=\"linker-script\" style=\"position:relative;\"><a href=\"#linker-script\" aria-label=\"linker script permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 
2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Linker Script</h3>\n<p>First, what is a linker script? It is a file that tells the linker how to map the sections of the input object files into the output file and where to place them in memory.</p>\n<p>Normally the linker’s built-in default linker script is used, so you do not need to specify one explicitly.</p>\n<p>Incidentally, the default linker script built into the linker can be printed by running <code class=\"language-text\">ld</code> with the <code class=\"language-text\">--verbose</code> option.</p>\n<p>However, programs such as an OS kernel or embedded firmware run without the loader and runtime support of a general-purpose OS, so they need a custom linker script that places code and data at the addresses the hardware or boot loader expects.</p>\n<p>Reference: <a href=\"https://sourceware.org/binutils/docs/ld/Scripts.html#Scripts\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Scripts (LD)</a></p>\n<p>Reference: <a href=\"https://sourceware.org/binutils/docs/ld/Basic-Script-Concepts.html\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Basic Script Concepts (LD)</a></p>\n<p>Reference: <a href=\"https://www.computex.co.jp/article/use_gcc_1.htm\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Mastering GNU C | Computex Co., Ltd.</a></p>\n<p>The full linker script used to build the xv6 OS kernel is as follows.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token comment\">/* Simple linker script for the JOS kernel.\n   See the GNU ld 'info' manual (\"info ld\") to learn the syntax. 
*/</span>\n\n<span class=\"token function\">OUTPUT_FORMAT</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"elf32-i386\"</span><span class=\"token punctuation\">,</span> <span class=\"token string\">\"elf32-i386\"</span><span class=\"token punctuation\">,</span> <span class=\"token string\">\"elf32-i386\"</span><span class=\"token punctuation\">)</span>\n<span class=\"token function\">OUTPUT_ARCH</span><span class=\"token punctuation\">(</span>i386<span class=\"token punctuation\">)</span>\n<span class=\"token function\">ENTRY</span><span class=\"token punctuation\">(</span>_start<span class=\"token punctuation\">)</span>\n\nSECTIONS\n<span class=\"token punctuation\">{</span>\n<span class=\"token comment\">/* Link the kernel at this address: \".\" means the current address */</span>\n        <span class=\"token comment\">/* Must be equal to KERNLINK */</span>\n<span class=\"token punctuation\">.</span> <span class=\"token operator\">=</span> <span class=\"token number\">0x80100000</span><span class=\"token punctuation\">;</span>\n\n<span class=\"token punctuation\">.</span>text <span class=\"token operator\">:</span> <span class=\"token function\">AT</span><span class=\"token punctuation\">(</span><span class=\"token number\">0x100000</span><span class=\"token punctuation\">)</span> <span class=\"token punctuation\">{</span>\n<span class=\"token operator\">*</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">.</span>text <span class=\"token punctuation\">.</span>stub <span class=\"token punctuation\">.</span>text<span class=\"token punctuation\">.</span><span class=\"token operator\">*</span> <span class=\"token punctuation\">.</span>gnu<span class=\"token punctuation\">.</span>linkonce<span class=\"token punctuation\">.</span>t<span class=\"token punctuation\">.</span><span class=\"token operator\">*</span><span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">}</span>\n\n<span 
class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>etext <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span><span class=\"token comment\">/* Define the 'etext' symbol to this value */</span>\n\n<span class=\"token punctuation\">.</span>rodata <span class=\"token operator\">:</span> <span class=\"token punctuation\">{</span>\n<span class=\"token operator\">*</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">.</span>rodata <span class=\"token punctuation\">.</span>rodata<span class=\"token punctuation\">.</span><span class=\"token operator\">*</span> <span class=\"token punctuation\">.</span>gnu<span class=\"token punctuation\">.</span>linkonce<span class=\"token punctuation\">.</span>r<span class=\"token punctuation\">.</span><span class=\"token operator\">*</span><span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">}</span>\n\n<span class=\"token comment\">/* Include debugging information in kernel memory */</span>\n<span class=\"token punctuation\">.</span>stab <span class=\"token operator\">:</span> <span class=\"token punctuation\">{</span>\n<span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>__STAB_BEGIN__ <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token operator\">*</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">.</span>stab<span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>__STAB_END__ <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token 
punctuation\">;</span>\n<span class=\"token punctuation\">}</span>\n\n<span class=\"token punctuation\">.</span>stabstr <span class=\"token operator\">:</span> <span class=\"token punctuation\">{</span>\n<span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>__STABSTR_BEGIN__ <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token operator\">*</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">.</span>stabstr<span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>__STABSTR_END__ <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token punctuation\">}</span>\n\n<span class=\"token comment\">/* Adjust the address for the data segment to the next page */</span>\n<span class=\"token punctuation\">.</span> <span class=\"token operator\">=</span> <span class=\"token function\">ALIGN</span><span class=\"token punctuation\">(</span><span class=\"token number\">0x1000</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n\n<span class=\"token comment\">/* Conventionally, Unix linkers provide pseudo-symbols\n * etext, edata, and end, at the end of the text, data, and bss.\n * For the kernel mapping, we need the address at the beginning\n * of the data section, but that's not one of the conventional\n * symbols, because the convention started before there was a\n * read-only rodata section between text and data. 
*/</span>\n<span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>data <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n\n<span class=\"token comment\">/* The data segment */</span>\n<span class=\"token punctuation\">.</span>data <span class=\"token operator\">:</span> <span class=\"token punctuation\">{</span>\n<span class=\"token operator\">*</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">.</span>data<span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">}</span>\n\n<span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>edata <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n\n<span class=\"token punctuation\">.</span>bss <span class=\"token operator\">:</span> <span class=\"token punctuation\">{</span>\n<span class=\"token operator\">*</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">.</span>bss<span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">}</span>\n\n<span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>end <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n\n<span class=\"token operator\">/</span>DISCARD<span class=\"token operator\">/</span> <span class=\"token operator\">:</span> <span class=\"token punctuation\">{</span>\n<span class=\"token operator\">*</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">.</span>eh_frame <span class=\"token punctuation\">.</span>note<span class=\"token punctuation\">.</span>GNU<span class=\"token operator\">-</span>stack<span 
class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">}</span>\n<span class=\"token punctuation\">}</span></code></pre></div>\n<h3 id=\"linker-script-structure\" style=\"position:relative;\"><a href=\"#linker-script-structure\" aria-label=\"linker script structure permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Linker Script Structure</h3>\n<p>The only element a linker script must contain is the <code class=\"language-text\">SECTIONS</code> element.</p>\n<p>A <code class=\"language-text\">MEMORY</code> element is often defined, but it is not required.</p>\n<p>The <code class=\"language-text\">SECTIONS</code> element defines the output sections and the addresses at which they are placed.</p>\n<p>Each section can be given both a virtual address (VMA), at which the code expects to run, and a load address (LMA), at which it is loaded into memory; the two may differ.</p>\n<p>Reference: <a href=\"https://www.computex.co.jp/article/use_gcc_1.htm\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Mastering GNU C | Computex Co., Ltd.</a></p>\n<p>Reference: <a href=\"http://blueeyes.sakura.ne.jp/2018/10/31/1676/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">How to Write a Linker Script</a></p>\n<p>The simplest linker script with only the <code class=\"language-text\">SECTIONS</code> element looks like the following example.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\">SECTIONS\n<span class=\"token punctuation\">{</span>\n  <span class=\"token punctuation\">.</span> <span class=\"token operator\">=</span> <span 
class=\"token number\">0x10000</span><span class=\"token punctuation\">;</span>\n  <span class=\"token punctuation\">.</span>text <span class=\"token operator\">:</span> <span class=\"token punctuation\">{</span> <span class=\"token operator\">*</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">.</span>text<span class=\"token punctuation\">)</span> <span class=\"token punctuation\">}</span>\n  <span class=\"token punctuation\">.</span> <span class=\"token operator\">=</span> <span class=\"token number\">0x8000000</span><span class=\"token punctuation\">;</span>\n  <span class=\"token punctuation\">.</span>data <span class=\"token operator\">:</span> <span class=\"token punctuation\">{</span> <span class=\"token operator\">*</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">.</span>data<span class=\"token punctuation\">)</span> <span class=\"token punctuation\">}</span>\n  <span class=\"token punctuation\">.</span>bss <span class=\"token operator\">:</span> <span class=\"token punctuation\">{</span> <span class=\"token operator\">*</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">.</span>bss<span class=\"token punctuation\">)</span> <span class=\"token punctuation\">}</span>\n<span class=\"token punctuation\">}</span></code></pre></div>\n<p>Reference: <a href=\"https://sourceware.org/binutils/docs/ld/Simple-Example.html\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Simple Example (LD)</a></p>\n<p>In the xv6 OS linker script, the following sections are defined:</p>\n<ul>\n<li><code class=\"language-text\">.text</code> : Where the executable binary is placed. 
Typically read/execute permission only.</li>\n<li><code class=\"language-text\">.rodata</code> : Where read-only data is placed.</li>\n<li><code class=\"language-text\">.stab</code> : Where an array of fixed-length structures called stabs is placed.</li>\n<li><code class=\"language-text\">.stabstr</code> : Where variable-length strings referenced from stabs are placed.</li>\n<li><code class=\"language-text\">.data</code> : Where initialized readable and writable data is placed.</li>\n<li><code class=\"language-text\">.bss</code> : Where uninitialized static data is placed (objects that are declared but not given an initial value; bss is short for block started by symbol).</li>\n</ul>\n<p>Reference: <a href=\"https://opensource.apple.com/source/gdb/gdb-292/doc/stabs.html/stabs_13.html\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">STABS - Using Stabs in Their Own Sections</a></p>\n<p>Reference: <a href=\"https://doc.ecoscentric.com/gnutools/doc/stabs/Stab-Section-Basics.html\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">STABS: Stab Section Basics</a></p>\n<p>Reference: <a href=\"https://en.wikipedia.org/wiki/.bss\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">.bss - Wikipedia</a></p>\n<p>Let me now walk through the contents of the linker script in order.</p>\n<h3 id=\"defining-the-entry-point\" style=\"position:relative;\"><a href=\"#defining-the-entry-point\" aria-label=\"defining the entry point permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Defining the Entry Point</h3>\n<p>Looking at the first lines of 
the linker script, three things are defined.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token comment\">/* Simple linker script for the JOS kernel.\n   See the GNU ld 'info' manual (\"info ld\") to learn the syntax. */</span>\n\n<span class=\"token function\">OUTPUT_FORMAT</span><span class=\"token punctuation\">(</span><span class=\"token string\">\"elf32-i386\"</span><span class=\"token punctuation\">,</span> <span class=\"token string\">\"elf32-i386\"</span><span class=\"token punctuation\">,</span> <span class=\"token string\">\"elf32-i386\"</span><span class=\"token punctuation\">)</span>\n<span class=\"token function\">OUTPUT_ARCH</span><span class=\"token punctuation\">(</span>i386<span class=\"token punctuation\">)</span>\n<span class=\"token function\">ENTRY</span><span class=\"token punctuation\">(</span>_start<span class=\"token punctuation\">)</span></code></pre></div>\n<p><code class=\"language-text\">OUTPUT_FORMAT</code> defines the format of the output binary.</p>\n<p><code class=\"language-text\">OUTPUT_ARCH</code> specifies the architecture that the output binary targets.</p>\n<p><code class=\"language-text\">ENTRY</code> specifies the symbol name of the function that will be executed first.</p>\n<p>Reference: <a href=\"https://sourceware.org/binutils/docs/ld/Entry-Point.html\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Entry Point (LD)</a></p>\n<p>The <code class=\"language-text\">_start</code> specified here is defined in <code class=\"language-text\">entry.S</code> as follows.</p>\n<div class=\"gatsby-highlight\" data-language=\"assembly\"><pre class=\"language-assembly\"><code class=\"language-assembly\"># By convention, the _start symbol specifies the ELF entry point.\n# Since we haven&#39;t set up virtual memory yet, our entry point is\n# the physical address of &#39;entry&#39;.\n.globl _start\n_start = V2P_WO(entry)\n\n# Entering xv6 on boot 
processor, with paging off.\n.globl entry\nentry:\n  # Turn on page size extension for 4Mbyte pages\n  movl    %cr4, %eax\n  orl     $(CR4_PSE), %eax\n  movl    %eax, %cr4\n  # Set page directory\n  movl    $(V2P_WO(entrypgdir)), %eax\n  movl    %eax, %cr3\n  # Turn on paging.\n  movl    %cr0, %eax\n  orl     $(CR0_PG|CR0_WP), %eax\n  movl    %eax, %cr0\n\n  # Set up the stack pointer.\n  movl $(stack + KSTACKSIZE), %esp\n\n  # Jump to main(), and switch to executing at\n  # high addresses. The indirect call is needed because\n  # the assembler produces a PC-relative instruction\n  # for a direct jump.\n  mov $main, %eax\n  jmp *%eax\n\n.comm stack, KSTACKSIZE</code></pre></div>\n<p>The details of <code class=\"language-text\">entry.S</code> will be described later.</p>\n<p>Reference: <a href=\"https://yohei.codes/ja/post/xv6-memory-1/#kernelld\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">xv6: OSはどうメモリを参照、管理するのか（前編） - yohei.codes</a></p>\n<h3 id=\"sections-defining-the-text-section\" style=\"position:relative;\"><a href=\"#sections-defining-the-text-section\" aria-label=\"sections defining the text section permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>SECTIONS: Defining the text Section</h3>\n<p>First, let’s look at the part that defines the text section.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token comment\">/* Link the kernel at this address: \".\" means the current address */</span>\n<span 
class=\"token comment\">/* Must be equal to KERNLINK */</span>\n<span class=\"token punctuation\">.</span> <span class=\"token operator\">=</span> <span class=\"token number\">0x80100000</span><span class=\"token punctuation\">;</span>\n\n<span class=\"token punctuation\">.</span>text <span class=\"token operator\">:</span> <span class=\"token function\">AT</span><span class=\"token punctuation\">(</span><span class=\"token number\">0x100000</span><span class=\"token punctuation\">)</span> <span class=\"token punctuation\">{</span>\n<span class=\"token operator\">*</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">.</span>text <span class=\"token punctuation\">.</span>stub <span class=\"token punctuation\">.</span>text<span class=\"token punctuation\">.</span><span class=\"token operator\">*</span> <span class=\"token punctuation\">.</span>gnu<span class=\"token punctuation\">.</span>linkonce<span class=\"token punctuation\">.</span>t<span class=\"token punctuation\">.</span><span class=\"token operator\">*</span><span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">}</span>\n\n<span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>etext <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span><span class=\"token comment\">/* Define the 'etext' symbol to this value */</span></code></pre></div>\n<p>The first line, <code class=\"language-text\">. 
= 0x80100000;</code>, sets the value of the special symbol <code class=\"language-text\">.</code>.</p>\n<p>This is used as the location counter.</p>\n<p>Sections defined afterward start at the address pointed to by the location counter.</p>\n<p>When a section is defined, the location counter is incremented by that section’s size.</p>\n<p>Reference: <a href=\"https://sourceware.org/binutils/docs/ld/Simple-Example.html\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Simple Example (LD)</a></p>\n<p>In xv6 OS, the initial value of the location counter is set to <code class=\"language-text\">0x80100000</code>.</p>\n<p>This means that the instruction addresses in the binary produced by the linker start from <code class=\"language-text\">0x80100000</code>.</p>\n<p>Section definitions use the following structure:</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\">section <span class=\"token punctuation\">[</span>address<span class=\"token punctuation\">]</span> <span class=\"token punctuation\">[</span><span class=\"token punctuation\">(</span>type<span class=\"token punctuation\">)</span><span class=\"token punctuation\">]</span> <span class=\"token operator\">:</span>\n  <span class=\"token punctuation\">[</span><span class=\"token function\">AT</span><span class=\"token punctuation\">(</span>lma<span class=\"token punctuation\">)</span><span class=\"token punctuation\">]</span>\n  <span class=\"token punctuation\">[</span><span class=\"token function\">ALIGN</span><span class=\"token punctuation\">(</span>section_align<span class=\"token punctuation\">)</span> <span class=\"token operator\">|</span> ALIGN_WITH_INPUT<span class=\"token punctuation\">]</span>\n  <span class=\"token punctuation\">[</span><span class=\"token function\">SUBALIGN</span><span class=\"token punctuation\">(</span>subsection_align<span class=\"token punctuation\">)</span><span class=\"token punctuation\">]</span>\n  <span 
class=\"token punctuation\">[</span>constraint<span class=\"token punctuation\">]</span>\n  <span class=\"token punctuation\">{</span>\n    output<span class=\"token operator\">-</span>section<span class=\"token operator\">-</span>command\n    output<span class=\"token operator\">-</span>section<span class=\"token operator\">-</span>command\n    …\n  <span class=\"token punctuation\">}</span> <span class=\"token punctuation\">[</span><span class=\"token operator\">></span>region<span class=\"token punctuation\">]</span> <span class=\"token punctuation\">[</span>AT<span class=\"token operator\">></span>lma_region<span class=\"token punctuation\">]</span> <span class=\"token punctuation\">[</span><span class=\"token operator\">:</span>phdr <span class=\"token operator\">:</span>phdr …<span class=\"token punctuation\">]</span> <span class=\"token punctuation\">[</span><span class=\"token operator\">=</span>fillexp<span class=\"token punctuation\">]</span> <span class=\"token punctuation\">[</span><span class=\"token punctuation\">,</span><span class=\"token punctuation\">]</span></code></pre></div>\n<p>Reference: <a href=\"https://sourceware.org/binutils/docs/ld/Output-Section-Description.html\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Output Section Description (LD)</a></p>\n<p><code class=\"language-text\">AT(0x100000)</code> defines the load address of the section as <code class=\"language-text\">0x100000</code>.</p>\n<p>Reference: <a href=\"https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_21.html\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Using LD, the GNU linker - Section Options</a></p>\n<p>To be honest, I had no idea at all what <code class=\"language-text\">*(.text .stub .text.* .gnu.linkonce.t.*)</code> was doing, but it appears to be a line that defines the contents of the section.</p>\n<p>There are several ways to define the contents, but the basic form is <code 
class=\"language-text\">filename(section)</code>.</p>\n<p>Several such input section descriptions can be listed, one per line.</p>\n<p>When <code class=\"language-text\">*</code> is used in place of a filename, as in <code class=\"language-text\">*()</code>, all object files provided at link time are targeted.</p>\n<p>In other words, <code class=\"language-text\">*(.text .stub .text.* .gnu.linkonce.t.*)</code> is an instruction telling the linker to place the data from the <code class=\"language-text\">.text .stub .text.* .gnu.linkonce.t.*</code> sections of each input object file into the <code class=\"language-text\">.text</code> section of the output executable.</p>\n<p>Reference: <a href=\"https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_19.html#SEC19\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Using LD, the GNU linker - Section Placement</a></p>\n<p>Files such as <code class=\"language-text\">entry.o</code> and <code class=\"language-text\">bio.o</code> given as linker inputs are all compiled as 32-bit ELF objects, so each of them has its own ELF header and <code class=\"language-text\">.text</code> section.</p>\n<p>That is why the definition above is needed: it merges them all into a single executable.</p>\n<p>Once the <code class=\"language-text\">.text</code> section definition is complete, we need to define <code class=\"language-text\">etext</code> to mark the end of the text section.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>etext <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span><span class=\"token comment\">/* Define the 'etext' symbol to this value */</span></code></pre></div>\n<p>Reference: <a href=\"https://linuxjm.osdn.jp/html/LDP_man-pages/man3/end.3.html\" target=\"_blank\" rel=\"nofollow 
noopener noreferrer\">Man page of END</a></p>\n<p>Here, <code class=\"language-text\">PROVIDE</code> is used to set <code class=\"language-text\">etext</code> to the current value of the location counter.</p>\n<p><code class=\"language-text\">PROVIDE</code> is a directive that defines a symbol only if it is referenced by the linked code but not defined by any of the input object files.</p>\n<p>Reference: <a href=\"https://sourceware.org/binutils/docs/ld/PROVIDE.html\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">PROVIDE (LD)</a></p>\n<h3 id=\"sections-defining-the-rodata-section\" style=\"position:relative;\"><a href=\"#sections-defining-the-rodata-section\" aria-label=\"sections defining the rodata section permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>SECTIONS: Defining the rodata Section</h3>\n<p>Next, the <code class=\"language-text\">.rodata</code> section is defined.</p>\n<p><code class=\"language-text\">rodata</code> stands for Read Only Data.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token punctuation\">.</span>rodata <span class=\"token operator\">:</span> <span class=\"token punctuation\">{</span>\n<span class=\"token operator\">*</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">.</span>rodata <span class=\"token punctuation\">.</span>rodata<span class=\"token punctuation\">.</span><span class=\"token operator\">*</span> <span class=\"token punctuation\">.</span>gnu<span class=\"token punctuation\">.</span>linkonce<span 
class=\"token punctuation\">.</span>r<span class=\"token punctuation\">.</span><span class=\"token operator\">*</span><span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">}</span></code></pre></div>\n<p>The linker definition uses the same syntax as the <code class=\"language-text\">.text</code> section, so I will skip the explanation.</p>\n<h3 id=\"sections-defining-the-stab-and-stabstr-sections\" style=\"position:relative;\"><a href=\"#sections-defining-the-stab-and-stabstr-sections\" aria-label=\"sections defining the stab and stabstr sections permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>SECTIONS: Defining the stab and stabstr Sections</h3>\n<p>Next, the debug-purpose <code class=\"language-text\">stab</code> sections are defined.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token comment\">/* Include debugging information in kernel memory */</span>\n<span class=\"token punctuation\">.</span>stab <span class=\"token operator\">:</span> <span class=\"token punctuation\">{</span>\n<span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>__STAB_BEGIN__ <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token operator\">*</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">.</span>stab<span 
class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>__STAB_END__ <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token punctuation\">}</span>\n\n<span class=\"token punctuation\">.</span>stabstr <span class=\"token operator\">:</span> <span class=\"token punctuation\">{</span>\n<span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>__STABSTR_BEGIN__ <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token operator\">*</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">.</span>stabstr<span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>__STABSTR_END__ <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token punctuation\">}</span></code></pre></div>\n<p><code class=\"language-text\">ld</code> does not create sections whose defined contents turn out to be empty.</p>\n<p>In the default code, each binary does not have a <code class=\"language-text\">.stab</code> section, so there was no <code class=\"language-text\">.stab</code> section in <code class=\"language-text\">kernel</code> either.</p>\n<p>However, I confirmed that adding <code class=\"language-text\">-gstabs</code> to the gcc compile options defined in the Makefile causes a <code class=\"language-text\">.stab</code> section to be created, and that section then appears in the linked <code class=\"language-text\">kernel</code> as 
well.</p>\n<h3 id=\"sections-defining-the-data-section\" style=\"position:relative;\"><a href=\"#sections-defining-the-data-section\" aria-label=\"sections defining the data section permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>SECTIONS: Defining the data Section</h3>\n<p>The <code class=\"language-text\">.data</code> section holds readable and writable data.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token comment\">/* Adjust the address for the data segment to the next page */</span>\n<span class=\"token punctuation\">.</span> <span class=\"token operator\">=</span> <span class=\"token function\">ALIGN</span><span class=\"token punctuation\">(</span><span class=\"token number\">0x1000</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n\n<span class=\"token comment\">/* Conventionally, Unix linkers provide pseudo-symbols\n* etext, edata, and end, at the end of the text, data, and bss.\n* For the kernel mapping, we need the address at the beginning\n* of the data section, but that's not one of the conventional\n* symbols, because the convention started before there was a\n* read-only rodata section between text and data. 
*/</span>\n<span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>data <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n\n<span class=\"token comment\">/* The data segment */</span>\n<span class=\"token punctuation\">.</span>data <span class=\"token operator\">:</span> <span class=\"token punctuation\">{</span>\n<span class=\"token operator\">*</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">.</span>data<span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">}</span>\n\n<span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>edata <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span></code></pre></div>\n<p>First, the line <code class=\"language-text\">. = ALIGN(0x1000);</code> aligns the current location to a <code class=\"language-text\">0x1000</code> boundary.</p>\n<p>This line is not assigning a specific address to the location counter like <code class=\"language-text\">. 
= 0x80100000;</code> did earlier.</p>\n<p>Instead, <code class=\"language-text\">ALIGN</code> aligns the current location to the boundary of the specified value, starting from the current location at the time <code class=\"language-text\">ALIGN</code> is executed.</p>\n<p>Looking at the actual generated <code class=\"language-text\">kernel</code> binary, we can see that the binary data was continuous up to <code class=\"language-text\">0x80107aa9</code>, and then the starting address of the <code class=\"language-text\">.data</code> section becomes <code class=\"language-text\">0x80108000</code>.</p>\n<div class=\"gatsby-highlight\" data-language=\"bash\"><pre class=\"language-bash\"><code class=\"language-bash\">$ objdump -D kernel <span class=\"token operator\">|</span> <span class=\"token function\">grep</span> -5 <span class=\"token string\">\"Disassembly of section .data:\"</span>\n80107aa6:67 6e                outsb  %ds:<span class=\"token punctuation\">(</span>%si<span class=\"token punctuation\">)</span>,<span class=\"token punctuation\">(</span>%dx<span class=\"token punctuation\">)</span>\n80107aa8:65                   gs\n80107aa9:64                   fs\n<span class=\"token punctuation\">..</span>.\n\nDisassembly of section .data:\n\n<span class=\"token number\">80108000</span> <span class=\"token operator\">&lt;</span>ctlmap<span class=\"token operator\">></span>:\n<span class=\"token punctuation\">..</span>.\n<span class=\"token number\">80108010</span>:11 <span class=\"token number\">17</span>                adc    %edx,<span class=\"token punctuation\">(</span>%edi<span class=\"token punctuation\">)</span>\n<span class=\"token number\">80108012</span>:05 <span class=\"token number\">12</span> <span class=\"token number\">14</span> <span class=\"token number\">19</span> <span class=\"token number\">15</span>       <span class=\"token function\">add</span>    <span class=\"token variable\">$0x15191412</span>,%eax</code></pre></div>\n<p>This is the 
result of aligning to the <code class=\"language-text\">0x1000</code> boundary from the point where the current location counter had been incremented to <code class=\"language-text\">0x80107aaa</code>.</p>\n<p>Reference: <a href=\"https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_14.html#IDX239\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Using LD, the GNU linker - Arithmetic Functions</a></p>\n<p>The rest of the definitions are the same as those already covered, so I will skip them.</p>\n<h3 id=\"sections-defining-the-bss-section\" style=\"position:relative;\"><a href=\"#sections-defining-the-bss-section\" aria-label=\"sections defining the bss section permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>SECTIONS: Defining the bss Section</h3>\n<p>The <code class=\"language-text\">.bss</code> section is defined as follows.</p>\n<p>This is the same as what we have already seen, so I will skip the explanation.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>edata <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n<span class=\"token punctuation\">.</span>bss <span class=\"token operator\">:</span> <span class=\"token punctuation\">{</span>\n<span class=\"token operator\">*</span><span class=\"token 
punctuation\">(</span><span class=\"token punctuation\">.</span>bss<span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">}</span>\n<span class=\"token function\">PROVIDE</span><span class=\"token punctuation\">(</span>end <span class=\"token operator\">=</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span></code></pre></div>\n<h3 id=\"sections-discard\" style=\"position:relative;\"><a href=\"#sections-discard\" aria-label=\"sections discard permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>SECTIONS: DISCARD</h3>\n<p>Sections listed under <code class=\"language-text\">/DISCARD/</code> are not linked into the generated object.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token operator\">/</span>DISCARD<span class=\"token operator\">/</span> <span class=\"token operator\">:</span> <span class=\"token punctuation\">{</span>\n<span class=\"token operator\">*</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">.</span>eh_frame <span class=\"token punctuation\">.</span>note<span class=\"token punctuation\">.</span>GNU<span class=\"token operator\">-</span>stack<span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">}</span></code></pre></div>\n<p><code class=\"language-text\">.eh_frame</code> is a section generated by gcc that stores information for obtaining a stack 
backtrace.</p>\n<p><code class=\"language-text\">.note.GNU-stack</code> is used in Linux object files to declare stack attributes.</p>\n<h2 id=\"kernel-entry-point\" style=\"position:relative;\"><a href=\"#kernel-entry-point\" aria-label=\"kernel entry point permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Kernel Entry Point</h2>\n<p>Next, let’s look at the <code class=\"language-text\">_start</code> function that was defined as the kernel’s entry point at link time.</p>\n<p>The <code class=\"language-text\">entry.S</code> file where <code class=\"language-text\">_start</code> is defined contains the following code.</p>\n<div class=\"gatsby-highlight\" data-language=\"assembly\"><pre class=\"language-assembly\"><code class=\"language-assembly\"># The xv6 kernel starts executing in this file. 
This file is linked with\n# the kernel C code, so it can refer to kernel symbols such as main().\n# The boot block (bootasm.S and bootmain.c) jumps to entry below.\n        \n# Multiboot header, for multiboot boot loaders like GNU Grub.\n# http://www.gnu.org/software/grub/manual/multiboot/multiboot.html\n#\n# Using GRUB 2, you can boot xv6 from a file stored in a\n# Linux file system by copying kernel or kernelmemfs to /boot\n# and then adding this menu entry:\n#\n# menuentry &quot;xv6&quot; {\n# insmod ext2\n# set root=&#39;(hd0,msdos1)&#39;\n# set kernel=&#39;/boot/kernel&#39;\n# echo &quot;Loading ${kernel}...&quot;\n# multiboot ${kernel} ${kernel}\n# boot\n# }\n\n#include &quot;asm.h&quot;\n#include &quot;memlayout.h&quot;\n#include &quot;mmu.h&quot;\n#include &quot;param.h&quot;\n\n# Multiboot header.  Data to direct multiboot loader.\n.p2align 2\n.text\n.globl multiboot_header\nmultiboot_header:\n  #define magic 0x1badb002\n  #define flags 0\n  .long magic\n  .long flags\n  .long (-magic-flags)\n\n# By convention, the _start symbol specifies the ELF entry point.\n# Since we haven&#39;t set up virtual memory yet, our entry point is\n# the physical address of &#39;entry&#39;.\n.globl _start\n_start = V2P_WO(entry)\n\n# Entering xv6 on boot processor, with paging off.\n.globl entry\nentry:\n  # Turn on page size extension for 4Mbyte pages\n  movl    %cr4, %eax\n  orl     $(CR4_PSE), %eax\n  movl    %eax, %cr4\n  # Set page directory\n  movl    $(V2P_WO(entrypgdir)), %eax\n  movl    %eax, %cr3\n  # Turn on paging.\n  movl    %cr0, %eax\n  orl     $(CR0_PG|CR0_WP), %eax\n  movl    %eax, %cr0\n\n  # Set up the stack pointer.\n  movl $(stack + KSTACKSIZE), %esp\n\n  # Jump to main(), and switch to executing at\n  # high addresses. 
The indirect call is needed because\n  # the assembler produces a PC-relative instruction\n  # for a direct jump.\n  mov $main, %eax\n  jmp *%eax\n\n.comm stack, KSTACKSIZE</code></pre></div>\n<h3 id=\"multiboot-header\" style=\"position:relative;\"><a href=\"#multiboot-header\" aria-label=\"multiboot header permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Multiboot Header</h3>\n<p>Reading through <code class=\"language-text\">entry.S</code> from the top, we find the following code.</p>\n<p>First, <code class=\"language-text\">.p2align 2</code> on the first line aligns the binary to a 4-byte boundary.</p>\n<p>Reference: <a href=\"https://sourceware.org/binutils/docs/as/P2align.html#P2align\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">P2align (Using as)</a></p>\n<p>Reference: <a href=\"https://stackoverflow.com/questions/21546946/what-does-p2align-do-in-asm-code\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">gcc - What does .p2align do in asm code? - Stack Overflow</a></p>\n<p>Immediately under the <code class=\"language-text\">.text</code> directive, <code class=\"language-text\">multiboot_header</code> is defined.</p>\n<p>Here, the multiboot header is defined to support the multiboot specification.</p>\n<div class=\"gatsby-highlight\" data-language=\"assembly\"><pre class=\"language-assembly\"><code class=\"language-assembly\"># Multiboot header.  
Data to direct multiboot loader.\n.p2align 2\n.text\n.globl multiboot_header\nmultiboot_header:\n  #define magic 0x1badb002\n  #define flags 0\n  .long magic\n  .long flags\n  .long (-magic-flags)</code></pre></div>\n<p>The multiboot specification standardizes how a bootloader loads an x86 operating system kernel.</p>\n<p>In the <a href=\"https://yukituna.com/3850/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">previous article</a>, I examined the xv6 OS bootloader code; if you want to boot the xv6 OS kernel with GRUB, for example, the kernel must comply with this multiboot specification.</p>\n<p>Bootloaders such as GRUB are adopted as the standard in Linux systems and others. (GRUB2 is normally used.)</p>\n<p>Reference: <a href=\"https://wiki2th.com/ja/Multiboot_Specification\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">マルチブート仕様</a></p>\n<p>Reference: <a href=\"https://wocota.hatenadiary.org/entry/20090607/1244389534\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">GRUBでOSを起動する - OSのようなもの</a></p>\n<p>Reference: <a href=\"https://inaz2.hatenablog.com/entry/2015/12/31/221319\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Trying to Run a Simple OS Kernel with GRUB - Momoiro Technology</a></p>\n<p>I am planning to actually try booting the xv6 OS with GRUB after I have finished reading through the kernel, so I will move on without tracing the code in detail.</p>\n<h3 id=\"defining-the-physical-address-of-the-entry-point\" style=\"position:relative;\"><a href=\"#defining-the-physical-address-of-the-entry-point\" aria-label=\"defining the physical address of the entry point permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 
2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Defining the Physical Address of the Entry Point</h3>\n<p>The next piece of code is as follows.</p>\n<div class=\"gatsby-highlight\" data-language=\"assembly\"><pre class=\"language-assembly\"><code class=\"language-assembly\"># By convention, the _start symbol specifies the ELF entry point.\n# Since we haven&#39;t set up virtual memory yet, our entry point is\n# the physical address of &#39;entry&#39;.\n.globl _start\n_start = V2P_WO(entry)</code></pre></div>\n<p>The <code class=\"language-text\">.globl</code> directive is a declaration that makes a symbol accessible from all linked files.</p>\n<p><code class=\"language-text\">_start</code> is the symbol that was referenced as the entry point from the linker script and elsewhere, and this declaration enables it to be called from outside <code class=\"language-text\">entry.S</code>.</p>\n<p>Reference: <a href=\"https://www.google.com/search?q=.globl&#x26;rlz=1C1GCEA_enJP959JP959&#x26;oq=.globl&#x26;aqs=chrome..69i57j0i512j0i10i512j0i10l7.483j0j7&#x26;sourceid=chrome&#x26;ie=UTF-8\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">.globl - Google Search</a></p>\n<p>Next, let’s look at the line <code class=\"language-text\">_start = V2P_WO(entry)</code>.</p>\n<p><code class=\"language-text\">V2P_WO</code> is the following macro defined in <code class=\"language-text\">memlayout.h</code>.</p>\n<div class=\"gatsby-highlight\" data-language=\"assembly\"><pre class=\"language-assembly\"><code class=\"language-assembly\">// Memory layout\n\n#define EXTMEM  0x100000            // Start of extended memory\n#define PHYSTOP 0xE000000           // Top physical memory\n#define DEVSPACE 0xFE000000         // Other devices are at high addresses\n\n// Key addresses for address space layout (see kmap in vm.c for layout)\n#define KERNBASE 0x80000000         // 
First kernel virtual address\n#define KERNLINK (KERNBASE+EXTMEM)  // Address where kernel is linked\n\n#define V2P(a) (((uint) (a)) - KERNBASE)\n#define P2V(a) ((void *)(((char *) (a)) + KERNBASE))\n\n#define V2P_WO(x) ((x) - KERNBASE)    // same as V2P, but without casts\n#define P2V_WO(x) ((x) + KERNBASE)    // same as P2V, but without casts</code></pre></div>\n<p>This is simply a macro that takes an address as an argument and subtracts <code class=\"language-text\">KERNBASE</code>, which is set to <code class=\"language-text\">0x80000000</code>.</p>\n<p>As we saw in the linker script, the kernel’s <code class=\"language-text\">.text</code> section is linked with <code class=\"language-text\">0x80100000</code> (<code class=\"language-text\">KERNLINK</code>) as its base.</p>\n<p>This layout separates the virtual address ranges for user mode and kernel mode: once x86 paging is enabled, the kernel runs at these high virtual addresses, which are mapped onto low physical memory.</p>\n<p>Reference: <a href=\"https://yohei.codes/ja/post/xv6-memory-1/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">xv6: OSはどうメモリを参照、管理するのか（前編） - yohei.codes</a></p>\n<p>However, when control first reaches the kernel, virtual memory has not yet been configured, so <code class=\"language-text\">_start = V2P_WO(entry)</code> subtracts <code class=\"language-text\">KERNBASE</code> (<code class=\"language-text\">0x80000000</code>) from the link-time address of <code class=\"language-text\">entry</code> to give the entry point <code class=\"language-text\">_start</code> a physical address. Since <code class=\"language-text\">entry</code> is linked just above <code class=\"language-text\">0x80100000</code>, <code class=\"language-text\">_start</code> ends up just above <code class=\"language-text\">0x100000</code>.</p>\n<h3 id=\"loading-the-kernel-entry-point\" style=\"position:relative;\"><a href=\"#loading-the-kernel-entry-point\" aria-label=\"loading the kernel entry point permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 
1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Loading the Kernel Entry Point</h3>\n<p>Let’s continue tracing the rest of <code class=\"language-text\">entry.S</code>.</p>\n<p>First, <code class=\"language-text\">.globl entry</code> makes <code class=\"language-text\">entry</code> a symbol accessible from outside.</p>\n<p>What <code class=\"language-text\">entry</code> does can be summarized simply: it enables paging so that the kernel can run at its high virtual addresses, then jumps to <code class=\"language-text\">main</code>.</p>\n<p>When execution reaches the <code class=\"language-text\">entry</code> label, the paging mechanism has not yet been enabled, so the first thing the code does is enable it.</p>\n<div class=\"gatsby-highlight\" data-language=\"assembly\"><pre class=\"language-assembly\"><code class=\"language-assembly\"># Entering xv6 on boot processor, with paging off.\n.globl entry\nentry:\n  # Turn on page size extension for 4Mbyte pages\n  movl    %cr4, %eax\n  orl     $(CR4_PSE), %eax\n  movl    %eax, %cr4\n  # Set page directory\n  movl    $(V2P_WO(entrypgdir)), %eax\n  movl    %eax, %cr3\n  # Turn on paging.\n  movl    %cr0, %eax\n  orl     $(CR0_PG|CR0_WP), %eax\n  movl    %eax, %cr0\n\n  # Set up the stack pointer.\n  movl $(stack + KSTACKSIZE), %esp\n\n  # Jump to main(), and switch to executing at\n  # high addresses. 
The indirect call is needed because\n  # the assembler produces a PC-relative instruction\n  # for a direct jump.\n  mov $main, %eax\n  jmp *%eax\n\n.comm stack, KSTACKSIZE</code></pre></div>\n<h3 id=\"what-is-paging\" style=\"position:relative;\"><a href=\"#what-is-paging\" aria-label=\"what is paging permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>What Is Paging?</h3>\n<p>Before tracing the code, let me briefly summarize what paging is.</p>\n<p>Paging is a method of managing memory by dividing it into fixed-size chunks called pages.</p>\n<p>This allows the divided memory regions to be treated as a linear address space. 
It also allows auxiliary storage devices such as SSDs to hold pages that are not currently in RAM, making it possible to handle more memory than the physical RAM capacity.</p>\n<p>In paging, writing a page from main memory to auxiliary storage is called a “page-out,” while writing a page back from auxiliary storage to main memory is called a “page-in” or “swap-in.”</p>\n<p>Through the paging mechanism, unused memory regions are saved to auxiliary storage via page-out.</p>\n<p>The next time that memory region is accessed, the CPU raises an exception called a “page fault” because the page is not present in physical memory, and the OS’s fault handler performs a swap-in to bring the page back into physical memory.</p>\n<p>Reference: <a href=\"https://babyron64.hatenablog.com/entry/2017/12/22/210124\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">x86_64アーキテクチャ - ばびろん’s すたっく メモリアクセス</a></p>\n<p>Reference: <a href=\"https://babyron64.hatenablog.com/entry/2017/12/22/232423\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">x86_64アーキテクチャ - ばびろん’s すたっく メモリアクセス(続き)</a></p>\n<p>Reference: <a href=\"https://e-words.jp/w/%E3%83%9A%E3%83%BC%E3%82%B8%E3%83%B3%E3%82%B0.html\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">What Is Paging? 
- e-Words IT Dictionary</a></p>\n<p>To enable the paging mechanism on an x86 CPU, the PG flag of <code class=\"language-text\">CR0 (Control Register 0)</code> must be set to 1.</p>\n<p>Let’s look at the part that actually enables paging.</p>\n<p>In the <a href=\"https://yukituna.com/3850/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">previous article</a>, I set the PE flag of <code class=\"language-text\">CR0 (Control Register 0)</code> when transitioning to protected mode — the approach here is almost identical.</p>\n<div class=\"gatsby-highlight\" data-language=\"assembly\"><pre class=\"language-assembly\"><code class=\"language-assembly\">entry:\n  # Turn on page size extension for 4Mbyte pages\n  movl    %cr4, %eax\n  orl     $(CR4_PSE), %eax\n  movl    %eax, %cr4\n  # Set page directory\n  movl    $(V2P_WO(entrypgdir)), %eax\n  movl    %eax, %cr3\n  # Turn on paging.\n  movl    %cr0, %eax\n  orl     $(CR0_PG|CR0_WP), %eax\n  movl    %eax, %cr0</code></pre></div>\n<p>The processing after <code class=\"language-text\"># Turn on paging.</code> at the end is where the PG flag of <code class=\"language-text\">CR0 (Control Register 0)</code> is set.</p>\n<p>The constants used for the flag operations are each defined as follows.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token comment\">// Control Register flags</span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">CR0_PE</span>          <span class=\"token expression\"><span class=\"token number\">0x00000001</span>      </span><span class=\"token comment\">// Protection Enable</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">CR0_WP</span>          <span class=\"token 
expression\"><span class=\"token number\">0x00010000</span>      </span><span class=\"token comment\">// Write Protect</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">CR0_PG</span>          <span class=\"token expression\"><span class=\"token number\">0x80000000</span>      </span><span class=\"token comment\">// Paging</span></span>\n\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">CR4_PSE</span>         <span class=\"token expression\"><span class=\"token number\">0x00000010</span>      </span><span class=\"token comment\">// Page size extension</span></span></code></pre></div>\n<p>From this we can see that not only the PG flag but also the WP flag is being set.</p>\n<p>When the WP flag is set, the CPU prevents even ring-0 (supervisor-level) code from writing to read-only pages.</p>\n<p>This makes it easier to implement the copy-on-write mechanism when creating new processes in the OS.</p>\n<p>I will write about this in a future article.</p>\n<p>Note that the WP flag is cleared when an x86 CPU comes out of reset, which is why it has to be set explicitly here.</p>\n<p>Reference: <a href=\"https://en.wikipedia.org/wiki/Control_register\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Control register - Wikipedia</a></p>\n<p>Reference: <a href=\"https://stackoverflow.com/questions/15275059/whats-the-purpose-of-x86-cr0-wp-bit\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">assembly - whats the purpose of x86 cr0 WP bit? 
- Stack Overflow</a></p>\n<p>Next, let’s look at the following code, which comes slightly before the CR0 setup.</p>\n<div class=\"gatsby-highlight\" data-language=\"assembly\"><pre class=\"language-assembly\"><code class=\"language-assembly\"># Turn on page size extension for 4Mbyte pages\nmovl    %cr4, %eax\norl     $(CR4_PSE), %eax\nmovl    %eax, %cr4</code></pre></div>\n<p>Here, the PSE flag of <code class=\"language-text\">CR4 (Control Register 4)</code> is being set.</p>\n<p>This flag controls whether 4 MiB pages can be used.</p>\n<p>When the PSE flag of CR4 is not set (the default), every page is 4 KiB.</p>\n<p>Conversely, when the PSE flag is set, a page directory entry whose PS bit is set maps a 4 MiB page directly, while entries without the PS bit still point to page tables of 4 KiB pages.</p>\n<p>Reference: <a href=\"https://en.wikipedia.org/wiki/Control_register\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Control register - Wikipedia</a></p>\n<p>I will cover the detailed background of why two page sizes exist in a separate article if the opportunity arises.</p>\n<p>We now know that xv6 OS uses 4 MiB pages at this early stage of boot.</p>\n<p>Finally, let’s look at the following:</p>\n<div class=\"gatsby-highlight\" data-language=\"assembly\"><pre class=\"language-assembly\"><code class=\"language-assembly\"># Set page directory\nmovl    $(V2P_WO(entrypgdir)), %eax\nmovl    %eax, %cr3</code></pre></div>\n<p>The paging mechanism in xv6 OS is enabled by the instructions that immediately follow, so paging is not yet active at this point.</p>\n<p>Therefore, the <code class=\"language-text\">$(V2P_WO(entrypgdir))</code> macro is used to convert the address of <code class=\"language-text\">entrypgdir</code> to a physical address before writing it to CR3.</p>\n<p>CR3 holds the physical address of the page directory; once paging is active, the x86 CPU walks the page directory and page tables starting from it to convert linear addresses to physical addresses.</p>\n<p><code class=\"language-text\">entrypgdir</code> is an array of <code class=\"language-text\">pde_t</code> page directory entries defined in <code class=\"language-text\">main.c</code>.</p>\n<div 
class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token comment\">// main.c</span>\n<span class=\"token class-name\">pde_t</span> entrypgdir<span class=\"token punctuation\">[</span><span class=\"token punctuation\">]</span><span class=\"token punctuation\">;</span>  <span class=\"token comment\">// For entry.S</span>\n\n<span class=\"token comment\">// The boot page table used in entry.S and entryother.S.</span>\n<span class=\"token comment\">// Page directories (and page tables) must start on page boundaries,</span>\n<span class=\"token comment\">// hence the __aligned__ attribute.</span>\n<span class=\"token comment\">// PTE_PS in a page directory entry enables 4Mbyte pages.</span>\n\n<span class=\"token keyword\">__attribute__</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">(</span><span class=\"token function\">__aligned__</span><span class=\"token punctuation\">(</span>PGSIZE<span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span>\n<span class=\"token class-name\">pde_t</span> entrypgdir<span class=\"token punctuation\">[</span>NPDENTRIES<span class=\"token punctuation\">]</span> <span class=\"token operator\">=</span> <span class=\"token punctuation\">{</span>\n  <span class=\"token comment\">// Map VA's [0, 4MB) to PA's [0, 4MB)</span>\n  <span class=\"token punctuation\">[</span><span class=\"token number\">0</span><span class=\"token punctuation\">]</span> <span class=\"token operator\">=</span> <span class=\"token punctuation\">(</span><span class=\"token number\">0</span><span class=\"token punctuation\">)</span> <span class=\"token operator\">|</span> PTE_P <span class=\"token operator\">|</span> PTE_W <span class=\"token operator\">|</span> PTE_PS<span class=\"token punctuation\">,</span>\n   \n  <span class=\"token comment\">// Map VA's [KERNBASE, KERNBASE+4MB) to PA's [0, 
4MB)</span>\n  <span class=\"token punctuation\">[</span>KERNBASE<span class=\"token operator\">>></span>PDXSHIFT<span class=\"token punctuation\">]</span> <span class=\"token operator\">=</span> <span class=\"token punctuation\">(</span><span class=\"token number\">0</span><span class=\"token punctuation\">)</span> <span class=\"token operator\">|</span> PTE_P <span class=\"token operator\">|</span> PTE_W <span class=\"token operator\">|</span> PTE_PS<span class=\"token punctuation\">,</span>\n<span class=\"token punctuation\">}</span><span class=\"token punctuation\">;</span></code></pre></div>\n<p>The array size is <code class=\"language-text\">NPDENTRIES</code>, which is defined in <code class=\"language-text\">mmu.h</code> as 1024:</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token comment\">// Page directory and page table constants.</span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">NPDENTRIES</span>      <span class=\"token expression\"><span class=\"token number\">1024</span>    </span><span class=\"token comment\">// # directory entries per page directory</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">NPTENTRIES</span>      <span class=\"token expression\"><span class=\"token number\">1024</span>    </span><span class=\"token comment\">// # PTEs per page table</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">PGSIZE</span>          <span class=\"token expression\"><span class=\"token number\">4096</span>    </span><span class=\"token comment\">// bytes mapped by a 
page</span></span></code></pre></div>\n<p>Only two of the 1024 entries in <code class=\"language-text\">entrypgdir</code> are explicitly initialized; the rest default to 0, meaning not present.</p>\n<p>Honestly, I only have a rough sense of what is happening here, but it appears to simply initialize the page directory entries.</p>\n<p>First, the expression <code class=\"language-text\">(0) | PTE_P | PTE_W | PTE_PS</code>, which is common to both entries, sets the following:</p>\n<ul>\n<li><code class=\"language-text\">0</code>: the physical base address of the mapped region (physical address 0)</li>\n<li><code class=\"language-text\">PTE_P</code>: set present</li>\n<li><code class=\"language-text\">PTE_W</code>: set read/write</li>\n<li><code class=\"language-text\">PTE_PS</code>: set the 4 MiB page size bit</li>\n</ul>\n<p>The first initializer, <code class=\"language-text\">[0] = (0) | PTE_P | PTE_W | PTE_PS,</code>, sets the 0th page directory entry to this value, so virtual addresses [0, 4 MiB) map to physical addresses [0, 4 MiB).</p>\n<p>The next initializer sets the <code class=\"language-text\">KERNBASE>>PDXSHIFT</code> = <code class=\"language-text\">0x80000000 >> 22</code> = 512th page directory entry to the same value, so virtual addresses [KERNBASE, KERNBASE + 4 MiB) also map to physical addresses [0, 4 MiB).</p>\n<p>The identity mapping in entry 0 lets the low-address code in <code class=\"language-text\">entry.S</code> keep running right after paging is enabled, and entry 512 lets execution continue at high addresses once it transfers to the main function.</p>\n<p>Reference: <a href=\"https://yohei.codes/ja/post/xv6-memory-1/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">xv6: OSはどうメモリを参照、管理するのか（前編） - yohei.codes</a></p>\n<p>Reference: <a href=\"https://stackoverflow.com/questions/58576065/what-does-this-code-mean-in-xv6-entrypgdir\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">what does this code mean in xv6 entrypgdir? 
- Stack Overflow</a></p>\n<h3 id=\"setting-the-stack-pointer\" style=\"position:relative;\"><a href=\"#setting-the-stack-pointer\" aria-label=\"setting the stack pointer permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Setting the Stack Pointer</h3>\n<p>Finally, the stack pointer is set up before transferring to the main function.</p>\n<div class=\"gatsby-highlight\" data-language=\"assembly\"><pre class=\"language-assembly\"><code class=\"language-assembly\"># Set up the stack pointer.\nmovl $(stack + KSTACKSIZE), %esp</code></pre></div>\n<p><code class=\"language-text\">KSTACKSIZE</code> is defined as 4096 in <code class=\"language-text\">param.h</code>.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">NPROC</span>        <span class=\"token expression\"><span class=\"token number\">64</span>  </span><span class=\"token comment\">// maximum number of processes</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">KSTACKSIZE</span> <span class=\"token expression\"><span class=\"token number\">4096</span>  </span><span class=\"token comment\">// size of per-process kernel stack</span></span>\n<span class=\"token macro property\"><span 
class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">NCPU</span>          <span class=\"token expression\"><span class=\"token number\">8</span>  </span><span class=\"token comment\">// maximum number of CPUs</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">NOFILE</span>       <span class=\"token expression\"><span class=\"token number\">16</span>  </span><span class=\"token comment\">// open files per process</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">NFILE</span>       <span class=\"token expression\"><span class=\"token number\">100</span>  </span><span class=\"token comment\">// open files per system</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">NINODE</span>       <span class=\"token expression\"><span class=\"token number\">50</span>  </span><span class=\"token comment\">// maximum number of active i-nodes</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">NDEV</span>         <span class=\"token expression\"><span class=\"token number\">10</span>  </span><span class=\"token comment\">// maximum major device number</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">ROOTDEV</span>       <span class=\"token expression\"><span class=\"token number\">1</span>  </span><span class=\"token comment\">// device number of file 
system root disk</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">MAXARG</span>       <span class=\"token expression\"><span class=\"token number\">32</span>  </span><span class=\"token comment\">// max exec arguments</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">MAXOPBLOCKS</span>  <span class=\"token expression\"><span class=\"token number\">10</span>  </span><span class=\"token comment\">// max # of blocks any FS op writes</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">LOGSIZE</span>      <span class=\"token expression\"><span class=\"token punctuation\">(</span>MAXOPBLOCKS<span class=\"token operator\">*</span><span class=\"token number\">3</span><span class=\"token punctuation\">)</span>  </span><span class=\"token comment\">// max data blocks in on-disk log</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">NBUF</span>         <span class=\"token expression\"><span class=\"token punctuation\">(</span>MAXOPBLOCKS<span class=\"token operator\">*</span><span class=\"token number\">3</span><span class=\"token punctuation\">)</span>  </span><span class=\"token comment\">// size of disk block cache</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span><span class=\"token directive keyword\">define</span> <span class=\"token macro-name\">FSSIZE</span>       <span class=\"token expression\"><span class=\"token number\">1000</span>  </span><span class=\"token comment\">// size of file system 
in blocks</span></span></code></pre></div>\n<p>Setting the stack pointer is necessary before transferring to C code, and this part took me some time to understand.</p>\n<p>The symbol <code class=\"language-text\">stack</code> is not given an initial value anywhere; it is declared by the <code class=\"language-text\">.comm stack, KSTACKSIZE</code> directive at the end of <code class=\"language-text\">entry.S</code>.</p>\n<p>The <code class=\"language-text\">.comm</code> directive declares a common symbol, which reserves <code class=\"language-text\">KSTACKSIZE</code> bytes of uninitialized storage at link time. Because the x86 stack grows toward lower addresses, <code class=\"language-text\">%esp</code> is set to <code class=\"language-text\">stack + KSTACKSIZE</code>, the top of that region.</p>\n<p>Reference: <a href=\"https://stackoverflow.com/questions/29008035/assembly-mov-unitialized-variable\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">c - assembly - mov unitialized variable? - Stack Overflow</a></p>\n<p>Quite tricky…</p>\n<h3 id=\"transferring-to-the-main-function\" style=\"position:relative;\"><a href=\"#transferring-to-the-main-function\" aria-label=\"transferring to the main function permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Transferring to the main Function</h3>\n<p>The series of processing that began during bootstrap finally comes to an end here, and execution transfers to the <code class=\"language-text\">main</code> function in <code class=\"language-text\">main.c</code>, which is the kernel body. Because a direct jump would be encoded as a PC-relative instruction and keep executing at low addresses, the code loads the absolute address of <code class=\"language-text\">main</code> into <code class=\"language-text\">%eax</code> and jumps through the register.</p>\n<div class=\"gatsby-highlight\" data-language=\"c\"><pre class=\"language-c\"><code class=\"language-c\"><span class=\"token macro property\"><span class=\"token directive-hash\">#</span> <span class=\"token expression\">Jump to <span class=\"token function\">main</span><span class=\"token 
punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span> and <span class=\"token keyword\">switch</span> to executing at</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span> <span class=\"token directive keyword\">high</span> <span class=\"token expression\">addresses<span class=\"token punctuation\">.</span> The indirect call is needed because</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span> <span class=\"token directive keyword\">the</span> <span class=\"token expression\">assembler produces a PC<span class=\"token operator\">-</span>relative instruction</span></span>\n<span class=\"token macro property\"><span class=\"token directive-hash\">#</span> <span class=\"token directive keyword\">for</span> <span class=\"token expression\">a direct jump<span class=\"token punctuation\">.</span></span></span>\nmov $main<span class=\"token punctuation\">,</span> <span class=\"token operator\">%</span>eax\njmp <span class=\"token operator\">*</span><span class=\"token operator\">%</span>eax\n\n<span class=\"token punctuation\">.</span>comm stack<span class=\"token punctuation\">,</span> KSTACKSIZE</code></pre></div>\n<p>This has gotten quite long, so I will continue in the next article.</p>\n<h2 id=\"summary\" style=\"position:relative;\"><a href=\"#summary\" aria-label=\"summary permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Summary</h2>\n<p>In this 
article, I traced through the kernel program build process, the linker script, and the flow of execution at the entry point.</p>\n<p>Next time, we should finally be able to trace the behavior of the kernel body itself.</p>\n<h2 id=\"references\" style=\"position:relative;\"><a href=\"#references\" aria-label=\"references permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>References</h2>\n<ul>\n<li><a href=\"https://amzn.to/3qZSCY7\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">30日でできる! OS自作入門</a></li>\n<li><a href=\"https://amzn.to/3qXYsZX\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">ゼロからのOS自作入門</a></li>\n<li><a href=\"https://amzn.to/3q8TU3K\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">はじめてのOSコードリーディング ~UNIX V6で学ぶカーネルのしくみ</a></li>\n<li><a href=\"https://amzn.to/3I6fkVt\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">詳解 Linuxカーネル</a></li>\n<li><a href=\"https://amzn.to/3JRUdI2\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">作って理解するOS x86系コンピュータを動かす理論と実装</a></li>\n</ul>","fields":{"slug":"/unix-xv6-002-load-kernel-en","tagSlugs":["/tag/unix-en/","/tag/xv-6-en/","/tag/kernel-en/","/tag/os-en/","/tag/english/"]},"frontmatter":{"date":"2022-01-16","description":"Reading the source code of the educational xv6 OS to learn about the kernel. 
This article traces how the xv6 kernel is loaded — covering the linker script and paging setup.","tags":["Unix (en)","xv6 (en)","Kernel (en)","OS (en)","English"],"title":"Seriously Reading the xv6 OS to Fully Understand the Kernel — Linker and Paging Edition","socialImage":{"publicURL":"/static/c3c9f1c2cccec8bc4489b3fb915130c5/unix-xv6-002-load-kernel.png"}}}},"pageContext":{"slug":"/unix-xv6-002-load-kernel-en"}},"staticQueryHashes":["251939775","401334301","825871152"]}