This page has been machine-translated from the original page.
Inspired by An Introduction to OS Code Reading: Learning Kernel Internals with UNIX V6, I’m reading xv6 OS.
Because UNIX V6 itself does not run on x86 CPUs, I decided to read the source of kash1064/xv6-public: xv6 OS, a fork of the xv6 OS repository. xv6 is a re-implementation of UNIX V6 for the x86 architecture.
In the previous article, I looked at how the mpinit function retrieves CPU information in a multiprocessor configuration.
This time, I will trace the behavior of the lapicinit function.
The lapicinit function
This time I will start from lapicinit, the function that main calls right after mpinit.
This function initializes the interrupt controller.
lapicinit(); // interrupt controller
The lapicinit function is defined in lapic.c as follows.
void lapicinit(void)
{
  if(!lapic) return;

  // Enable local APIC; set spurious interrupt vector.
  lapicw(SVR, ENABLE | (T_IRQ0 + IRQ_SPURIOUS));

  // The timer repeatedly counts down at bus frequency
  // from lapic[TICR] and then issues an interrupt.
  // If xv6 cared more about precise timekeeping,
  // TICR would be calibrated using an external time source.
  lapicw(TDCR, X1);
  lapicw(TIMER, PERIODIC | (T_IRQ0 + IRQ_TIMER));
  lapicw(TICR, 10000000);

  // Disable logical interrupt lines.
  lapicw(LINT0, MASKED);
  lapicw(LINT1, MASKED);

  // Disable performance counter overflow interrupts
  // on machines that provide that interrupt entry.
  if(((lapic[VER]>>16) & 0xFF) >= 4) lapicw(PCINT, MASKED);

  // Map error interrupt to IRQ_ERROR.
  lapicw(ERROR, T_IRQ0 + IRQ_ERROR);

  // Clear error status register (requires back-to-back writes).
  lapicw(ESR, 0);
  lapicw(ESR, 0);

  // Ack any outstanding interrupts.
  lapicw(EOI, 0);

  // Send an Init Level De-Assert to synchronise arbitration ID's.
  lapicw(ICRHI, 0);
  lapicw(ICRLO, BCAST | INIT | LEVEL);
  while(lapic[ICRLO] & DELIVS)
    ;

  // Enable interrupts on the APIC (but not on the processor).
  lapicw(TPR, 0);
}
The line if(!lapic) return; returns immediately if the global variable lapic is null, that is, if no Local APIC address was found.
Local APIC registers
The variable lapic was set during the MP table retrieval covered in the previous article, from the lapicaddr field of the MP Configuration Table header.
Let’s confirm the actual value stored in this global variable using a debugger.
$ b *0x801027a0
$ continue
Inspecting the contents of lapic shows that it holds the address 0xfee00000.
$ info variables lapic
File lapic.c:
44:volatile uint *lapic;
$ p lapic
$1 = (volatile uint *) 0xfee00000
This lapic points to the base of the memory-mapped Local APIC registers.
The Local APIC registers are 32-bit values, memory-mapped starting at the address given in the MP Configuration Table.
Each register sits at an offset from that base that is aligned to a 16-byte boundary.
The following page is helpful for further details.
Reference: APIC - OSDev Wiki
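The layout described above can be sketched in a few lines of C. This is only an illustration, not xv6 code: LAPIC_BASE is the value seen in the debugger session, and lapic_reg_index and lapic_reg_addr are hypothetical helper names introduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Base address observed in the debugger session above. */
#define LAPIC_BASE 0xFEE00000u

/* Hypothetical helper: turn a register's byte offset into an index
 * into an array of 32-bit words, like the "/4" in lapic.c's macros. */
static uint32_t lapic_reg_index(uint32_t byte_offset)
{
    /* Local APIC registers sit on 16-byte boundaries. */
    assert(byte_offset % 16 == 0);
    return byte_offset / 4;
}

/* Hypothetical helper: the address that lapic[index] refers to. */
static uint32_t lapic_reg_addr(uint32_t index)
{
    return LAPIC_BASE + index * (uint32_t)sizeof(uint32_t);
}
```

For example, lapic_reg_addr(lapic_reg_index(0x00F0)) gives 0xFEE000F0, the address of the Spurious Interrupt Vector Register.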
From here, we will configure the Local APIC registers.
The following values defined in lapic.c are used for this purpose.
// Local APIC registers, divided by 4 for use as uint[] indices.
#define ID (0x0020/4) // ID
#define VER (0x0030/4) // Version
#define TPR (0x0080/4) // Task Priority
#define EOI (0x00B0/4) // EOI
#define SVR (0x00F0/4) // Spurious Interrupt Vector
#define ENABLE 0x00000100 // Unit Enable
#define ESR (0x0280/4) // Error Status
#define ICRLO (0x0300/4) // Interrupt Command
#define INIT 0x00000500 // INIT/RESET
#define STARTUP 0x00000600 // Startup IPI
#define DELIVS 0x00001000 // Delivery status
#define ASSERT 0x00004000 // Assert interrupt (vs deassert)
#define DEASSERT 0x00000000
#define LEVEL 0x00008000 // Level triggered
#define BCAST 0x00080000 // Send to all APICs, including self.
#define BUSY 0x00001000
#define FIXED 0x00000000
#define ICRHI (0x0310/4) // Interrupt Command [63:32]
#define TIMER (0x0320/4) // Local Vector Table 0 (TIMER)
#define X1 0x0000000B // divide counts by 1
#define PERIODIC 0x00020000 // Periodic
#define PCINT (0x0340/4) // Performance Counter LVT
#define LINT0 (0x0350/4) // Local Vector Table 1 (LINT0)
#define LINT1 (0x0360/4) // Local Vector Table 2 (LINT1)
#define ERROR (0x0370/4) // Local Vector Table 3 (ERROR)
#define MASKED 0x00010000 // Interrupt masked
#define TICR (0x0380/4) // Timer Initial Count
#define TCCR (0x0390/4) // Timer Current Count
#define TDCR (0x03E0/4) // Timer Divide Configuration
Setting the Spurious Interrupt Vector and enabling the Local APIC
Moving on.
This line uses the lapicw function to set the Spurious Interrupt Vector and enable the Local APIC.
// Enable local APIC; set spurious interrupt vector.
lapicw(SVR, ENABLE | (T_IRQ0 + IRQ_SPURIOUS));
The lapicw function, used frequently from here on, is defined as follows.
static void
lapicw(int index, int value)
{
  lapic[index] = value;
  lapic[ID]; // wait for write to finish, by reading
}
It takes index and value as arguments and writes value to the Local APIC register at lapic[index].
ID in lapic[ID] is defined as (0x0020/4) in lapic.c.
lapic[ID]; does not change any settings; its purpose is to wait for the preceding lapic write to complete by reading this value.
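To try the write-then-read pattern outside the kernel, here is a minimal sketch that points lapic at an ordinary array instead of the real register page. The fake_regs array is an assumption made purely for illustration; on real hardware the read-back matters because the target is a device, not RAM.

```c
#include <assert.h>
#include <stdint.h>

#define ID  (0x0020/4)  /* ID register index, as in lapic.c */
#define SVR (0x00F0/4)  /* Spurious Interrupt Vector register index */

/* Stand-in for the memory-mapped register page (assumption: plain RAM,
 * so the read-back has no hardware effect in this sketch). */
static volatile uint32_t fake_regs[0x400/4];
static volatile uint32_t *lapic = fake_regs;

static void lapicw(int index, uint32_t value)
{
    assert(index >= 0 && index < 0x400/4);
    lapic[index] = value;
    /* Because lapic is volatile, the compiler must emit this load
     * even though the result is discarded. */
    (void)lapic[ID];
}
```

Calling lapicw(SVR, 0x13F) stores 0x13F in fake_regs[0x00F0/4] and leaves every other entry untouched.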
Let’s look at the line lapicw(SVR, ENABLE | (T_IRQ0 + IRQ_SPURIOUS));.
The index is SVR.
This refers to the offset (0x00F0/4) of the Spurious Interrupt Vector Register.
As described in the following article, setting bit 8 (0x100) of the Spurious Interrupt Vector Register enables the APIC.
Reference: APIC - OSDev Wiki
Also, in order for the Local APIC to be able to receive interrupts, the Spurious Interrupt Vector must be set.
The lower 8 bits of the Spurious Interrupt Vector Register hold the vector number that is delivered to the CPU when a spurious interrupt occurs.
Therefore, 0x13f, the OR of 0x100 and 0x3f, is set in the Spurious Interrupt Vector Register.
This lower value 0x3f comes from the expression T_IRQ0 + IRQ_SPURIOUS.
The values such as T_IRQ0 are defined in traps.h.
// These are arbitrarily chosen, but with care not to overlap
// processor defined exceptions or interrupt vectors.
#define T_SYSCALL 64 // system call
#define T_DEFAULT 500 // catchall
#define T_IRQ0 32 // IRQ 0 corresponds to int T_IRQ
#define IRQ_TIMER 0
#define IRQ_KBD 1
#define IRQ_COM1 4
#define IRQ_IDE 14
#define IRQ_ERROR 19
#define IRQ_SPURIOUS 31
According to the OSDev Wiki, the simplest value to set for the Spurious Interrupt Vector is 0xff, but xv6 uses 0x3f (T_IRQ0 + IRQ_SPURIOUS = 32 + 31).
(I don’t really understand the reason for this…)
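The arithmetic behind the SVR value can be checked with a small sketch. The constants are copied from lapic.c and traps.h; svr_value, svr_vector, and svr_enabled are hypothetical helper names introduced here.

```c
#include <assert.h>
#include <stdint.h>

#define ENABLE       0x00000100u  /* bit 8: APIC software enable */
#define T_IRQ0       32
#define IRQ_SPURIOUS 31

/* Value that lapicinit writes to the SVR. */
static uint32_t svr_value(void)
{
    uint32_t vector = T_IRQ0 + IRQ_SPURIOUS;
    assert(vector <= 0xFFu);  /* must fit in the low 8 bits */
    return ENABLE | vector;
}

/* Low 8 bits: the spurious interrupt vector. */
static uint32_t svr_vector(uint32_t svr)
{
    return svr & 0xFFu;
}

/* Bit 8: whether the Local APIC is enabled. */
static int svr_enabled(uint32_t svr)
{
    return (int)((svr >> 8) & 1u);
}
```

Here svr_value() is 0x13f, svr_vector(0x13f) is 0x3f, and svr_enabled(0x13f) is 1.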
Timer configuration
Next are the following lines.
// The timer repeatedly counts down at bus frequency
// from lapic[TICR] and then issues an interrupt.
// If xv6 cared more about precise timekeeping,
// TICR would be calibrated using an external time source.
lapicw(TDCR, X1);
lapicw(TIMER, PERIODIC | (T_IRQ0 + IRQ_TIMER));
lapicw(TICR, 10000000);
TDCR is (0x03E0/4), which is the offset of the Divide Configuration Register (for Timer).
This setting is used when configuring the Local APIC timer period.
The Local APIC timer is a timer built into each processor’s Local APIC, and it generates interrupts only for that processor.
The Local APIC timer starts from a value set by the kernel as the Timer Initial Count, decrements at a fixed interval, and fires an interrupt when it reaches 0.
The decrement speed depends on the CPU’s bus frequency divided by the Timer Divide Configuration value.
The Local APIC timer supports two modes: Periodic mode and One-shot mode.
In Periodic mode, when the count reaches 0, it automatically resets to the initial value and begins counting down again.
In One-shot mode, when the count reaches 0 and fires an interrupt, the count stays at 0 until the program explicitly sets a new initial value.
Reference: APIC timer - OSDev Wiki
Reference: Timer interrupt selected as Local APIC timer interrupt - ZDNet Japan
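The difference between the two modes can be sketched as a toy model. This is a software simulation with hypothetical names (struct apic_timer, timer_start, timer_tick, count_interrupts), not how the hardware is programmed; each call to timer_tick stands for one decrement of the counter.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the Local APIC timer (hypothetical, for illustration). */
struct apic_timer {
    uint32_t initial;   /* Timer Initial Count */
    uint32_t current;   /* Timer Current Count */
    int      periodic;  /* 1 = Periodic mode, 0 = One-shot mode */
};

static void timer_start(struct apic_timer *t, uint32_t initial, int periodic)
{
    assert(t);
    t->initial = initial;
    t->current = initial;
    t->periodic = periodic;
}

/* One decrement step. Returns 1 if an interrupt fires on this step. */
static int timer_tick(struct apic_timer *t)
{
    if (t->current == 0)
        return 0;                  /* stopped: one-shot already expired */
    t->current--;
    if (t->current != 0)
        return 0;
    if (t->periodic)
        t->current = t->initial;   /* reload and keep counting */
    return 1;
}

/* Number of interrupts raised over the given number of steps. */
static int count_interrupts(struct apic_timer *t, int ticks)
{
    int fired = 0;
    for (int i = 0; i < ticks; i++)
        fired += timer_tick(t);
    return fired;
}
```

With an initial count of 3, nine steps raise three interrupts in Periodic mode but only one in One-shot mode, after which the count stays at 0.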
In xv6, the Timer Divide Configuration value is set to 0x0000000B (X1), which selects a divide value of 1, so the timer counts at the undivided bus frequency.
Reference image: Intel SDM vol3
Also, the Timer Initial Count is set to 10000000 by lapicw(TICR, 10000000);.
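To get a feel for what these two values mean, the sketch below decodes the divide value and estimates the resulting interrupt rate. The encoding (bits 3, 1, and 0 form a 3-bit code, with 111 meaning divide by 1) follows my reading of the Intel SDM, and the 1 GHz bus frequency used in the example is purely hypothetical, since xv6 does not measure it. The function names are mine.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the divisor from a Divide Configuration Register value.
 * Bits 3, 1 and 0 form a 3-bit code: 000 -> 2, 001 -> 4, ...,
 * 110 -> 128, 111 -> 1 (encoding as I understand it from the SDM). */
static uint32_t tdcr_divisor(uint32_t tdcr)
{
    uint32_t code = ((tdcr >> 1) & 0x4u) | (tdcr & 0x3u);
    return code == 7u ? 1u : (2u << code);
}

/* Estimated timer interrupts per second for a given bus frequency. */
static uint64_t timer_interrupts_per_second(uint64_t bus_hz,
                                            uint32_t tdcr, uint32_t ticr)
{
    assert(ticr != 0);
    return bus_hz / tdcr_divisor(tdcr) / ticr;
}
```

With xv6's settings (0xB and 10000000) and an assumed 1 GHz bus clock, this works out to 100 interrupts per second, or one every 10 ms.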
Incidentally, the following line configures Local Vector Table 0 (TIMER).
lapicw(TIMER, PERIODIC | (T_IRQ0 + IRQ_TIMER));
The Local Vector Table (LVT) itself is a table in the Local APIC that maps events to interrupt vectors.
Through the LVT, software can specify how local interrupts are sent to the CPU.
Reference image: Intel SDM vol3
The LVT consists of the following 32-bit registers. Details will be examined when they are actually used.
- LVT Timer Register (FEE0 0320H)
- LVT Thermal Monitor Register (FEE0 0330H)
- LVT Performance Counter Register (FEE0 0340H)
- LVT LINT0 Register (FEE0 0350H)
- LVT LINT1 Register (FEE0 0360H)
- LVT Error Register (FEE0 0370H)
Here, TIMER specifies the LVT Timer Register at (0x0320/4).
The LVT Timer Register specifies the interrupt to raise when the APIC timer fires.
Here, PERIODIC (0x00020000) sets bit 17, which belongs to the Timer Mode field of the LVT Timer Register, switching the timer to Periodic mode.
Also, T_IRQ0 + IRQ_TIMER sets the interrupt vector; the value is 32 + 0 = 0x20.
In other words, each timer expiry reaches the CPU as interrupt vector 0x20, the same T_IRQ0 + IRQ_TIMER value that trap() in trap.c checks for.
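The value written to the LVT Timer Register can be taken apart with a few bit operations. In this sketch the constants come from lapic.c and traps.h, the helper names are hypothetical, and bit positions are counted from bit 0 (vector in bits 0-7, mask in bit 16, periodic mode in bit 17).

```c
#include <assert.h>
#include <stdint.h>

#define PERIODIC  0x00020000u  /* bit 17 */
#define MASKED    0x00010000u  /* bit 16 */
#define T_IRQ0    32
#define IRQ_TIMER 0

/* Value that lapicinit writes to the LVT Timer Register. */
static uint32_t lvt_timer_value(void)
{
    uint32_t vector = T_IRQ0 + IRQ_TIMER;
    assert(vector <= 0xFFu);
    return PERIODIC | vector;
}

static uint32_t lvt_vector(uint32_t lvt)   { return lvt & 0xFFu; }
static int lvt_is_masked(uint32_t lvt)     { return (int)((lvt >> 16) & 1u); }
static int lvt_is_periodic(uint32_t lvt)   { return (int)((lvt >> 17) & 1u); }
```

The resulting value is 0x00020020: vector 0x20, periodic, and not masked. The same lvt_is_masked helper returns 1 for the MASKED value that is written to other LVT entries.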
Disable logical interrupt lines
Moving on.
0x00010000 is set at (0x0350/4) and (0x0360/4) respectively.
// Disable logical interrupt lines.
lapicw(LINT0, MASKED);
lapicw(LINT1, MASKED);
First, (0x0350/4) and (0x0360/4) are the LVT LINT0 Register and LVT LINT1 Register.
Both are LVT entries that define how interrupts arriving on the LINT0 and LINT1 pins are delivered.
Here, bit 16 (0x00010000), the Mask bit, is set to mask them.
(I had no idea why it was necessary to disable these two — why?!)
Disable performance counter overflow interrupts
Moving on.
// Disable performance counter overflow interrupts
// on machines that provide that interrupt entry.
if(((lapic[VER]>>16) & 0xFF) >= 4) lapicw(PCINT, MASKED);
This line masks the LVT Performance Counter Register, which generates an interrupt on performance counter overflow, but only under a specific condition.
That condition is that bits 16-23 of the Local APIC Version Register are 4 or greater.
In the Intel SDM this field is Max LVT Entry (the number of LVT entries minus 1), so the check appears to ask whether this Local APIC has a Performance Counter LVT entry at all.
In any case, when I ran through this in the debugger, this line was not executed, so I will proceed on the assumption that xv6OS does not mask the LVT Performance Counter Register.
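The condition can be isolated into a small sketch. I am assuming, based on the Intel SDM, that bits 16-23 of the version register hold the Max LVT Entry field; the helper names and the sample register values in the usage note are made up for illustration and were not read from real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 16-23 of the Local APIC Version Register
 * (Max LVT Entry in the SDM, as I understand it). */
static uint32_t lapic_max_lvt_entry(uint32_t ver)
{
    return (ver >> 16) & 0xFFu;
}

/* Same test as in lapicinit: is there a performance counter entry? */
static int lapic_has_pcint(uint32_t ver)
{
    return lapic_max_lvt_entry(ver) >= 4u;
}
```

For a made-up version value of 0x00050014 the field is 5 and PCINT would be masked; for 0x00030011 the field is 3 and the write would be skipped, which matches what the debugger showed in my environment.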
Setting the Error Register
The interrupt vector for the LVT Error Register is set here.
// Map error interrupt to IRQ_ERROR.
lapicw(ERROR, T_IRQ0 + IRQ_ERROR);
Clearing the ESR
The ESR is cleared here.
// Clear error status register (requires back-to-back writes).
lapicw(ESR, 0);
lapicw(ESR, 0);
ESR stands for Error Status Register; bits are set when an error occurs.
The corresponding bits are shown in the following diagram.
Reference image: Intel SDM vol3
Why it is cleared twice is, once again, a mystery.
Checking the EOI
EOI stands for End of Interrupt, a signal sent to the interrupt controller (here, the Local APIC rather than the legacy PIC) to indicate that handling of a specific interrupt has completed.
// Ack any outstanding interrupts.
lapicw(EOI, 0);
Reference: End of interrupt - Wikipedia
For compatibility reasons, the value written to the EOI register must be 0.
Setting the Interrupt Command Register
Both ICRHI and ICRLO are Interrupt Command Registers.
These two registers are used to send interrupts to CPUs.
Note that writing to ICRLO (0x0300/4) sends the interrupt, while writing to ICRHI (0x0310/4) does not, so the values are set in the order ICRHI, then ICRLO.
// Send an Init Level De-Assert to synchronise arbitration ID's.
lapicw(ICRHI, 0);
lapicw(ICRLO, BCAST | INIT | LEVEL);
while(lapic[ICRLO] & DELIVS);
Here, an INIT Level De-assert message is sent to all Local APICs in the system to synchronize and set their arbitration IDs.
Reference image: Intel SDM vol3
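The value written to ICRLO can be broken into its fields as follows. The constants are from lapic.c; the field positions (delivery mode in bits 8-10, level in bit 14, trigger mode in bit 15, destination shorthand in bits 18-19) follow my reading of the Intel SDM, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define INIT   0x00000500u  /* delivery mode 101b */
#define ASSERT 0x00004000u  /* bit 14; left clear here, meaning de-assert */
#define LEVEL  0x00008000u  /* bit 15: level triggered */
#define BCAST  0x00080000u  /* shorthand 10b: all including self */

/* Value that lapicinit writes to ICRLO. */
static uint32_t icr_init_deassert(void)
{
    uint32_t v = BCAST | INIT | LEVEL;
    assert((v & ASSERT) == 0u);  /* a de-assert message has bit 14 clear */
    return v;
}

static uint32_t icr_delivery_mode(uint32_t icr)  { return (icr >> 8) & 0x7u; }
static int icr_level_assert(uint32_t icr)        { return (int)((icr >> 14) & 1u); }
static int icr_level_triggered(uint32_t icr)     { return (int)((icr >> 15) & 1u); }
static uint32_t icr_dest_shorthand(uint32_t icr) { return (icr >> 18) & 0x3u; }
```

The combined value is 0x00088500: delivery mode 5 (INIT), level bit 0 (de-assert), level triggered, and destination shorthand 2 (all including self).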
Finally, setting the TPR (Task Priority Register) to 0 allows the CPU to handle all interrupts, enabling the interrupt functionality.
// Enable interrupts on the APIC (but not on the processor).
lapicw(TPR, 0);
Incidentally, setting the task-priority class (bits 7:4 of the TPR) to 15 inhibits all interrupts delivered through the Local APIC.
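The effect of the TPR can be approximated with a short sketch. It is a simplification under my own assumptions: the priority class of a vector is its upper 4 bits, an interrupt is accepted only if its class is higher than the class in the TPR, and interrupts already in service (which also affect the real decision) are ignored. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Priority class: bits 7:4 of a vector or of the TPR. */
static uint32_t priority_class(uint32_t value)
{
    return (value >> 4) & 0xFu;
}

/* Simplified check: would the Local APIC pass this vector to the CPU?
 * Ignores in-service interrupts, which real hardware also considers. */
static int tpr_allows(uint32_t tpr, uint32_t vector)
{
    assert(vector <= 0xFFu);
    return priority_class(vector) > priority_class(tpr);
}
```

With the TPR at 0, the vectors xv6 uses (0x20 through 0x3f) all pass. With a task-priority class of 15 (a TPR value of 0xF0), no vector passes, not even 0xFF.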
Summary
With this, all processing inside the lapicinit function called from main is complete.
There were many parts I didn’t fully understand this time, so I plan to add notes as I figure things out.
Next time: segment descriptors at last.
It’s becoming more and more interesting as the inner workings of the kernel start to come into focus.