What is the difference between user and kernel modes in operating systems?
What are the differences between User Mode and Kernel Mode, why and how do you activate either of them, and what are their use cases?
operating-system
1
Possible duplicate of What is the difference between the kernel space and the user space?
– Ciro Santilli 新疆改造中心 六四事件 法轮功
Jun 11 '17 at 8:00
1
@CiroSantilli709大抓捕六四事件法轮功 a question which was asked 7 years ago can't be closed as a duplicate of a question asked 6 years ago. If they are really duplicates, the closure should be the other way around.
– Salvador Dali
Jun 19 '17 at 5:35
1
@SalvadorDali hi, current consensus is to close by "quality": meta.stackexchange.com/questions/147643/… Since "quality" is not measurable, I just go by upvotes. ;-) Likely it comes down to which question hit the best newb Google keywords on the title. I encourage you to simply copy your answer there with a disclaimer added at the bottom, and link from this one, in case it closes.
– Ciro Santilli 新疆改造中心 六四事件 法轮功
Jun 19 '17 at 7:04
asked Aug 21 '09 at 11:22 by Alex
edited Feb 16 at 15:17 by Ciro Santilli 新疆改造中心 六四事件 法轮功
7 Answers
Answer (126 votes, accepted)
Kernel Mode
In Kernel mode, the executing code has complete and unrestricted access to the underlying hardware. It can execute any CPU instruction and reference any memory address. Kernel mode is generally reserved for the lowest-level, most trusted functions of the operating system. Crashes in kernel mode are catastrophic; they will halt the entire PC.
User Mode
In User mode, the executing code has no ability to directly access hardware or reference memory. Code running in user mode must delegate to system APIs to access hardware or memory. Due to the protection afforded by this sort of isolation, crashes in user mode are always recoverable. Most of the code running on your computer will execute in user mode.
Read more: Understanding User and Kernel Mode
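As a rough illustration of "delegating to system APIs", here is a minimal Python sketch that writes a file and reads it back through the thin `os` wrappers around the operating system's file calls. The function name `roundtrip_through_kernel` and the temporary file are made up for this example. The only point is that the user-mode code never touches the disk itself; it asks the kernel to do so.

```python
import os
import tempfile


def roundtrip_through_kernel(payload: bytes) -> bytes:
    """Write payload to a temporary file and read it back.

    Each os.* call below asks the kernel to act on our behalf;
    this user-mode code never talks to the storage hardware directly.
    """
    fd, path = tempfile.mkstemp()          # kernel creates and opens the file
    try:
        os.write(fd, payload)              # request: copy our bytes to the file
        os.lseek(fd, 0, os.SEEK_SET)       # request: rewind the file offset
        return os.read(fd, len(payload))   # request: copy the bytes back to us
    finally:
        os.close(fd)                       # request: release the descriptor
        os.unlink(path)                    # request: remove the file


if __name__ == "__main__":
    print(roundtrip_through_kernel(b"hello from user mode"))
```

If the kernel refuses a request (for example because of file permissions), the wrapper raises `OSError` instead of letting the program proceed.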
Wonder when the CPU is running operating system code, which mode is the processor in?
– JackieLam
Jun 19 '13 at 9:00
2
@JackieLam : It should be in the kernel mode.
– kadina
Aug 21 '14 at 0:19
So per se, To run a user space process, it must be mapped to kernel space?
– roottraveller
Sep 9 '17 at 12:57
Answer (43 votes)
These are two different modes in which your processor can operate. Early computers, the kind that filled a whole room, had no such separation: if one program crashed, it halted the whole machine. Computer architects decided to change that, and modern microprocessors implement at least two different privilege states in hardware.
User mode:
- the mode in which all user programs execute. Code in this mode has no direct access to hardware or to memory outside its own address space. The reason is that if all programs ran in kernel mode, they would be able to overwrite each other's memory. When a program needs one of these resources, it makes a call to the underlying API. Every process started by Windows, except the System process, runs in user mode.
Kernel mode:
- the mode in which all kernel code executes (including various drivers). It has access to every resource and to the underlying hardware. Any CPU instruction can be executed and every memory address can be accessed. This mode is reserved for the operating system and for drivers that operate at the lowest level.
How the switch occurs
A user program cannot put the CPU into kernel mode on its own. The switch happens when the CPU is interrupted (by a timer, the keyboard, or other I/O) or when the program requests a system call. When an interrupt occurs, the CPU stops executing the currently running program, switches to kernel mode, and executes the interrupt handler. The handler saves the CPU state, performs its work, restores the state, and returns to user mode.
http://en.wikibooks.org/wiki/Windows_Programming/User_Mode_vs_Kernel_Mode
http://tldp.org/HOWTO/KernelAnalysis-HOWTO-3.html
http://en.wikipedia.org/wiki/Direct_memory_access
http://en.wikipedia.org/wiki/Interrupt_request
Wonder when the CPU is running operating system code, which mode is the processor in?
– JackieLam
Jun 19 '13 at 9:00
1
@JackieLam: kernel mode
– Apurv Nerlekar
Feb 1 '16 at 23:49
Answer (8 votes)
A processor in a computer running Windows has two different modes: user mode and kernel mode. The processor switches between the two modes depending on what type of code is running on the processor. Applications run in user mode, and core operating system components run in kernel mode. While many drivers run in kernel mode, some drivers may run in user mode.
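One way to observe this switching from an ordinary program is to look at how the operating system accounts CPU time, which it splits into time spent in user mode and time spent in kernel mode on the process's behalf. The sketch below uses Python's `os.times()`; the helper names `cpu_time_split`, `pure_computation` and `many_kernel_requests` are invented for this illustration, and the exact numbers depend on the machine and the OS's accounting granularity.

```python
import os


def cpu_time_split(work):
    """Run work() and return (user_seconds, system_seconds) it consumed."""
    before = os.times()
    work()
    after = os.times()
    return after.user - before.user, after.system - before.system


def pure_computation():
    # Arithmetic only: expected to be spent almost entirely in user mode.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


def many_kernel_requests():
    # Every os.stat call crosses into the kernel and back.
    for _ in range(200_000):
        os.stat(".")


if __name__ == "__main__":
    print("computation  (user, system):", cpu_time_split(pure_computation))
    print("kernel calls (user, system):", cpu_time_split(many_kernel_requests))
```

On a typical system the second line should show a noticeably larger system-time share than the first, although this is an expectation rather than a guarantee.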
When you start a user-mode application, Windows creates a process for the application. The process provides the application with a private virtual address space and a private handle table. Because an application's virtual address space is private, one application cannot alter data that belongs to another application. Each application runs in isolation, and if an application crashes, the crash is limited to that one application. Other applications and the operating system are not affected by the crash.
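The private address space can be sketched with two processes. The example below assumes a POSIX system (it uses `os.fork`, which is not available on Windows, even though the paragraph above describes Windows; the principle is the same). The names `counter` and `child_write_is_invisible_to_parent` are hypothetical.

```python
import os

counter = [0]  # lives in this process's private address space


def child_write_is_invisible_to_parent():
    """Fork a child that modifies `counter`; return what the parent still sees."""
    pid = os.fork()
    if pid == 0:
        # Child: this changes only the child's own copy of the data.
        counter[0] = 999
        os._exit(0)  # leave immediately, skipping the parent's cleanup code
    _, status = os.waitpid(pid, 0)
    # Parent: its copy of `counter` was never touched by the child.
    return counter[0], os.WEXITSTATUS(status)


if __name__ == "__main__":
    value, exit_code = child_write_is_invisible_to_parent()
    print("parent sees counter =", value, "child exit code =", exit_code)
```

The child starts with a copy of the parent's memory, but from then on each process reads and writes only its own pages.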
In addition to being private, the virtual address space of a user-mode application is limited. A processor running in user mode cannot access virtual addresses that are reserved for the operating system. Limiting the virtual address space of a user-mode application prevents the application from altering, and possibly damaging, critical operating system data.
All code that runs in kernel mode shares a single virtual address space. This means that a kernel-mode driver is not isolated from other drivers and the operating system itself. If a kernel-mode driver accidentally writes to the wrong virtual address, data that belongs to the operating system or another driver could be compromised. If a kernel-mode driver crashes, the entire operating system crashes.
If you are a Windows user, the following link has more details:
Communication between user mode and kernel mode
Answer (5 votes)
I'm going to take a stab in the dark and guess you're talking about Windows. In a nutshell, kernel mode has full access to hardware, but user mode doesn't. For instance, many if not most device drivers are written to run in kernel mode because they need to control finer details of their hardware.
See also this wikibook.
2
This is important to you as a programmer because kernel bugs tend to wreak far worse havoc than you may be accustomed to. One reason for the kernel/user distinction is so the kernel can monitor/control critical system resources and protect each user from the others. It's a bit oversimplified, but still helpful, to remind yourself that user bugs are often annoying, but kernel bugs tend to bring the entire machine down.
– Adam Liss
Aug 21 '09 at 11:39
Answer (3 votes)
Other answers already explained the difference between user and kernel mode. If you really want to get into detail you should get a copy of Windows Internals, an excellent book written by Mark Russinovich and David Solomon describing the architecture and inside details of the various Windows operating systems.
Answer (2 votes)
CPU rings are the most clear distinction
In x86 protected mode, the CPU is always in one of 4 rings. The Linux kernel only uses 0 and 3:
- 0 for kernel
- 3 for users
This is the most hard and fast definition of kernel vs userland.
Why Linux does not use rings 1 and 2: CPU Privilege Rings: Why rings 1 and 2 aren't used?
How is the current ring determined?
The current ring is selected by a combination of:
- the global descriptor table (GDT): an in-memory table of GDT entries, each of which has a field `Privl` that encodes the ring.
  The LGDT instruction sets the address of the current descriptor table.
  See also: http://wiki.osdev.org/Global_Descriptor_Table
- the segment registers CS, DS, etc., each of which holds a selector that points to an entry in the GDT.
  For example, a selector of 0x08 in CS means that GDT entry 1 is currently active for the executing code (entry 0 is the reserved null descriptor). The low two bits of CS give the current privilege level.
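To make the selector layout concrete, here is a small Python sketch that splits a 16-bit x86 segment selector into its fields: bits 15..3 are the table index, bit 2 chooses between the GDT and an LDT, and bits 1..0 are the privilege level (for CS, the current ring). The names `Selector` and `decode_selector` are made up for this example, and the two sample values are the ones commonly reported for CS on x86-64 Linux, which is an assumption about that kernel's GDT layout rather than something the CPU mandates.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int   # which descriptor table entry is selected
    table: str   # "GDT" or "LDT"
    rpl: int     # privilege level in the low two bits, i.e. the ring (0..3)


def decode_selector(value: int) -> Selector:
    """Split a 16-bit x86 segment selector into its three fields."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("a segment selector is a 16-bit value")
    return Selector(
        index=value >> 3,                          # bits 15..3
        table="LDT" if value & 0b100 else "GDT",   # bit 2, the table indicator
        rpl=value & 0b11,                          # bits 1..0
    )


if __name__ == "__main__":
    # Assumed sample values: kernel-mode and user-mode CS on x86-64 Linux.
    for cs in (0x10, 0x33):
        print(hex(cs), decode_selector(cs))
```

A debugger can show the live value: in GDB on x86, `info registers cs` prints the selector of the process being debugged.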
What can each ring do?
The CPU chip is physically built so that:
- ring 0 can do anything
- ring 3 cannot run several instructions or write to several registers, most notably:
  - it cannot change its own ring! Otherwise, it could set itself to ring 0 and rings would be useless.
    In other words, it cannot modify the current segment descriptor, which determines the current ring.
  - it cannot modify the page tables: How does x86 paging work?
    In other words, it cannot modify the CR3 register, and paging itself prevents modification of the page tables.
    This prevents one process from seeing the memory of other processes, for security and ease of programming reasons.
  - it cannot register interrupt handlers. Those are configured by writing to memory locations, which is also prevented by paging.
    Handlers run in ring 0, so letting userland register them would break the security model.
    In other words, it cannot use the LGDT and LIDT instructions.
  - it cannot do IO instructions like `in` and `out`, and thus cannot make arbitrary hardware accesses.
    Otherwise, for example, file permissions would be useless if any program could directly read from disk.
    More precisely, thanks to Michael Petch: it is actually possible for the OS to allow IO instructions on ring 3; this is controlled by the Task State Segment.
    What is not possible is for ring 3 to give itself permission to do so if it didn't have it in the first place.
    Linux disallows it by default, although a privileged process can request port access through the `ioperm` and `iopl` system calls. See also: Why doesn't Linux use the hardware context switch via the TSS?
How do programs and operating systems transition between rings?
- when the CPU is turned on, it starts running the initial program in ring 0 (well, kind of, but it is a good approximation). You can think of this initial program as being the kernel (but it is normally a bootloader that then calls the kernel, still in ring 0).
- when a userland process wants the kernel to do something for it, like write to a file, it uses an instruction that generates an interrupt, such as `int 0x80`, to signal the kernel.
  When this happens, the CPU calls an interrupt callback handler which the kernel registered at boot time.
  This handler runs in ring 0; it decides whether the kernel will allow the action, does the action, and restarts the userland program in ring 3.
- when the `exec` system call is used (or when the kernel starts `/init`), the kernel prepares the registers and memory of the new userland process, then it jumps to the entry point and switches the CPU to ring 3.
- if the program tries to do something naughty, like write to a forbidden register or memory address (because of paging), the CPU also calls some kernel callback handler in ring 0.
  But since the userland was naughty, the kernel might kill the process this time, or give it a warning with a signal.
- when the kernel boots, it sets up a hardware clock with some fixed frequency, which generates interrupts periodically.
  These interrupts run in ring 0 and allow the kernel to schedule which userland processes to wake up.
  This way, scheduling can happen even if the processes are not making any system calls.
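The "naughty program" case in the list above can be observed from userland. The following Python sketch assumes a POSIX system (it relies on `os.fork`) and uses `ctypes.memset` to write to address 0, which is normally unmapped. The function name `run_naughty_child` is hypothetical. The child is expected to be killed by a signal while the parent keeps running, but the exact signal can vary by platform.

```python
import ctypes
import os
import signal


def run_naughty_child():
    """Fork a child that writes to address 0 and report how it ended."""
    pid = os.fork()
    if pid == 0:
        # Child: address 0 is normally not mapped, so this write faults,
        # the CPU enters the kernel, and the kernel signals this process only.
        ctypes.memset(0, 0, 1)
        os._exit(0)  # only reached if the write was somehow allowed
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return signal.Signals(os.WTERMSIG(status)).name
    return "exited normally"


if __name__ == "__main__":
    print("child ended with:", run_naughty_child())
    print("parent is still running")
```

The parent learns about the crash only through the exit status that the kernel reports; its own memory and execution are unaffected.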
What is the point of having multiple rings?
There are two major advantages of separating kernel and userland:
- it is easier to make programs as you are more certain one won't interfere with the other. E.g., one userland process does not have to worry about overwriting the memory of another program because of paging, nor about putting hardware in an invalid state for another process.
- it is more secure. E.g. file permissions and memory separation could prevent a hacking app from reading your bank data. This supposes, of course, that you trust the kernel.
How to play around with it?
I've created a bare metal setup that should be a good way to manipulate rings directly: https://github.com/cirosantilli/x86-bare-metal-examples
I didn't have the patience to make a userland example unfortunately, but I did go as far as paging setup, so userland should be feasible. I'd love to see a pull request.
Alternatively, Linux kernel modules run in ring 0, so you can use them to try out privileged operations, e.g. read the control registers: How to access the control registers cr0,cr2,cr3 from a program? Getting segmentation fault
Here is a convenient QEMU + Buildroot setup to try it out without killing your host.
The downside of kernel modules is that other kthreads are running and could interfere with your experiments. But in theory you can take over all interrupt handlers with your kernel module and own the system, that would be an interesting project actually.
Negative rings
While negative rings are not actually referenced in the Intel manual, there are actually CPU modes which have further capabilities than ring 0 itself, and so are a good fit for the "negative ring" name.
One example is the hypervisor mode used in virtualization.
For further details see: https://security.stackexchange.com/questions/129098/what-is-protection-ring-1
ARM
In ARM, the rings are called Exception Levels instead, but the main ideas remain the same.
There are 4 exception levels in ARMv8, commonly used as:
- EL0: userland
- EL1: kernel
- EL2: hypervisors, for example Xen.
  A hypervisor is to an OS what an OS is to userland.
  For example, Xen allows you to run multiple OSes such as Linux or Windows on the same system at the same time, and it isolates the OSes from one another for security and ease of debugging, just like Linux does for userland programs.
  Hypervisors are a key part of today's cloud infrastructure: they allow multiple servers to run on a single piece of hardware, keeping hardware usage always close to 100% and saving a lot of money.
  AWS, for example, used Xen until 2017, when its move to KVM made the news.
- EL3: yet another level. TODO example.
The ARMv8 Architecture Reference Manual DDI 0487C.a - Chapter D1 - The AArch64 System Level Programmer's Model - Figure D1-1 illustrates this beautifully (figure not reproduced here).
Note how ARM, maybe due to the benefit of hindsight, has a better naming convention for the privilege levels than x86, without the need for negative levels: 0 being the lowest and 3 the highest. Higher levels tend to be created more often than lower ones.
The current EL can be queried with the `MRS` instruction: what is the current execution mode/exception level, etc?
ARM does not require all exception levels to be present, so that implementations that don't need a feature can save chip area. ARMv8 "Exception levels" says:
An implementation might not include all of the Exception levels. All implementations must include EL0 and EL1. EL2 and EL3 are optional.
QEMU, for example, defaults to EL1, but EL2 and EL3 can be enabled with command line options: qemu-system-aarch64 entering el1 when emulating a53 power up
1
Since this question isn't specific to any OS, `in` and `out` are available to ring 3. The TSS can point to an IO permission table in the current task granting read/write access to all or specific ports.
– Michael Petch
Feb 21 at 2:37
Of course, if you set the IOPL bits to the value 3, then the ring 3 program has full port access and the TSS IO permissions don't apply.
– Michael Petch
Feb 21 at 3:44
@MichaelPetch thanks, I didn't know this. I have updated the answer.
– Ciro Santilli 新疆改造中心 六四事件 法轮功
Feb 21 at 13:33
Answer (1 vote)
What
Basically, the difference between kernel and user modes is not OS-dependent: it is achieved by hardware design, which restricts some instructions so that they can run only in kernel mode. All other goals, such as memory protection, are built on that restriction.
How
It means that the processor is in either kernel mode or user mode at any given time. Using some mechanisms, the architecture can guarantee that whenever it switches to kernel mode, OS code is fetched to be run.
Why
With this hardware infrastructure in place, common OSes can achieve the following:
- preventing user programs from accessing the whole memory, so that programs cannot overwrite the OS, for example,
- preventing user programs from performing sensitive instructions, such as those that change CPU memory pointer bounds, so that programs cannot break out of their memory bounds, for example.
Accepted answer ("Kernel Mode" / "User Mode"): answered Aug 21 '09 at 11:31 by rahul, edited Mar 24 '14 at 0:07 by Chris Simmons
Answer beginning "These are two different modes": answered Oct 2 '12 at 6:14 by Salvador Dali
Wonder when the CPU is running operating system code, which mode is the processor in?
– JackieLam
Jun 19 '13 at 9:00
1
@JackieLam: kernel mode
– Apurv Nerlekar
Feb 1 '16 at 23:49
up vote
8
down vote
A processor in a computer running Windows has two different modes: user mode and kernel mode. The processor switches between the two modes depending on what type of code is running on the processor. Applications run in user mode, and core operating system components run in kernel mode. While many drivers run in kernel mode, some drivers may run in user mode.
When you start a user-mode application, Windows creates a process for the application. The process provides the application with a private virtual address space and a private handle table. Because an application's virtual address space is private, one application cannot alter data that belongs to another application. Each application runs in isolation, and if an application crashes, the crash is limited to that one application. Other applications and the operating system are not affected by the crash.
In addition to being private, the virtual address space of a user-mode application is limited. A processor running in user mode cannot access virtual addresses that are reserved for the operating system. Limiting the virtual address space of a user-mode application prevents the application from altering, and possibly damaging, critical operating system data.
All code that runs in kernel mode shares a single virtual address space. This means that a kernel-mode driver is not isolated from other drivers and the operating system itself. If a kernel-mode driver accidentally writes to the wrong virtual address, data that belongs to the operating system or another driver could be compromised. If a kernel-mode driver crashes, the entire operating system crashes.
If you are a Windows user, this link has more detail: Communication between user mode and kernel mode
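The isolation described above can be observed without any special tooling. The following is a minimal sketch, assuming a standard Python installation; run_child, CRASHING_CHILD and HEALTHY_CHILD are names invented for this example. A child process deliberately reads an invalid address, and the parent, which lives in a separate address space, keeps running.

```python
import subprocess
import sys


def run_child(code):
    """Run a snippet in a separate Python process and return its exit status."""
    result = subprocess.run([sys.executable, "-c", code], capture_output=True)
    return result.returncode


# Reading address 0 is an invalid memory access inside the child process.
CRASHING_CHILD = "import ctypes; ctypes.string_at(0)"
HEALTHY_CHILD = "print('still fine')"

if __name__ == "__main__":
    print("crashing child exit status:", run_child(CRASHING_CHILD))
    print("healthy child exit status:", run_child(HEALTHY_CHILD))
    print("parent process is still running")
```

The exact exit status of the crashing child differs between operating systems, so the sketch only relies on it being non-zero. A comparable bad access inside a kernel-mode driver has no such safety net, which is the point the answer makes about system-wide crashes.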
edited Jun 3 '16 at 8:20
community wiki
2 revs, 2 users 67%
Sangeen
up vote
5
down vote
I'm going to take a stab in the dark and guess you're talking about Windows. In a nutshell, kernel mode has full access to hardware, but user mode doesn't. For instance, many if not most device drivers are written in kernel mode because they need to control finer details of their hardware.
See also this wikibook.
2
This is important to you as a programmer because kernel bugs tend to wreak far worse havoc than you may be accustomed to. One reason for the kernel/user distinction is so the kernel can monitor/control critical system resources and protect each user from the others. It's a bit oversimplified, but still helpful, to remind yourself that user bugs are often annoying, but kernel bugs tend to bring the entire machine down.
– Adam Liss
Aug 21 '09 at 11:39
answered Aug 21 '09 at 11:27
Mark Rushakoff
up vote
3
down vote
Other answers already explained the difference between user and kernel mode. If you really want to get into detail you should get a copy of Windows Internals, an excellent book written by Mark Russinovich and David Solomon describing the architecture and inside details of the various Windows operating systems.
answered Aug 21 '09 at 11:52
Dirk Vollmar
up vote
2
down vote
CPU rings are the clearest distinction
In x86 protected mode, the CPU is always in one of 4 rings. The Linux kernel only uses 0 and 3:
- 0 for kernel
- 3 for users
This is the most hard and fast definition of kernel vs userland.
Why Linux does not use rings 1 and 2: CPU Privilege Rings: Why rings 1 and 2 aren't used?
How is the current ring determined?
The current ring is selected by a combination of:
- the global descriptor table (GDT): an in-memory table of descriptor entries, each of which has a field Privl which encodes the ring.
  The LGDT instruction sets the address of the current descriptor table.
  See also: http://wiki.osdev.org/Global_Descriptor_Table
- the segment registers CS, DS, etc., which hold selectors that point to an entry in the GDT.
  For example, CS = 0x08 means that GDT entry 1 is currently active for the executing code (entry 0 is the reserved null descriptor), and the low two bits of CS give the current ring.
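To make the selector layout concrete, here is a small sketch that splits a 16-bit segment selector into its fields: the table index in bits 15..3, the table indicator (GDT or LDT) in bit 2, and the privilege level in bits 1..0. The function name decode_selector and the sample values are chosen only for illustration.

```python
def decode_selector(selector):
    """Split a 16-bit x86 segment selector into its three fields."""
    return {
        "index": selector >> 3,                      # entry number in the table
        "table": "LDT" if selector & 0x4 else "GDT", # table indicator bit
        "ring": selector & 0x3,                      # privilege level, 0..3
    }


if __name__ == "__main__":
    # Illustrative values: one ring 0 code selector and one ring 3 selector.
    for value in (0x08, 0x33):
        print(hex(value), decode_selector(value))
```

When the selector sits in CS, the ring field is the current privilege level of the running code, so the same two bits separate kernel code from userland code.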
What can each ring do?
The CPU chip is physically built so that:
- ring 0 can do anything
- ring 3 cannot run several instructions or write to several registers, most notably:
  - it cannot change its own ring! Otherwise, it could set itself to ring 0 and rings would be useless.
    In other words, it cannot modify the current segment descriptor, which determines the current ring.
  - it cannot modify the page tables: How does x86 paging work?
    In other words, it cannot modify the CR3 register, and paging itself prevents modification of the page tables.
    This prevents one process from seeing the memory of other processes, for security and ease of programming.
  - it cannot register interrupt handlers. Those are configured by writing to memory locations, which is also prevented by paging.
    Handlers run in ring 0, so letting ring 3 install them would break the security model.
    In other words, it cannot use the LGDT and LIDT instructions.
  - it cannot do IO instructions like in and out, and thus cannot make arbitrary hardware accesses.
    Otherwise, for example, file permissions would be useless if any program could directly read from disk.
    More precisely, thanks to Michael Petch: the OS can allow IO instructions in ring 3. This is controlled by the IOPL bits and by the IO permission bitmap in the Task State Segment.
    What is not possible is for ring 3 to give itself permission to do so if it didn't have it in the first place.
    Linux disallows it by default for ordinary processes. See also: Why doesn't Linux use the hardware context switch via the TSS?
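The port IO rule can be modelled in a few lines. This is a simplified sketch of the check for a one-byte port access in protected mode, not a description of any particular kernel: if the current ring is at least as privileged as IOPL the access is allowed, otherwise the IO permission bitmap in the TSS decides, where a cleared bit means the port is allowed. The names io_allowed and allow_port are invented for this example.

```python
def allow_port(io_bitmap, port):
    """Clear the bitmap bit for a port, which grants access to it."""
    byte_index, bit_index = divmod(port, 8)
    io_bitmap[byte_index] &= ~(1 << bit_index) & 0xFF


def io_allowed(cpl, iopl, port, io_bitmap):
    """Model the check for a one-byte in/out at the given port."""
    if cpl <= iopl:
        return True               # privileged enough, bitmap is not consulted
    byte_index, bit_index = divmod(port, 8)
    if byte_index >= len(io_bitmap):
        return False              # ports beyond the bitmap are denied
    return (io_bitmap[byte_index] >> bit_index) & 1 == 0


if __name__ == "__main__":
    bitmap = bytearray([0xFF] * 16)   # every port denied to start with
    allow_port(bitmap, 0x60)          # the OS grants one specific port
    print(io_allowed(3, 0, 0x60, bitmap))
    print(io_allowed(3, 0, 0x61, bitmap))
```

In this model ring 3 code never writes to the bitmap or to IOPL itself. Only the operating system does, which matches the point that ring 3 cannot grant itself the permission.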
How do programs and operating systems transition between rings?
- when the CPU is turned on, it starts running the initial program in ring 0 (well, kind of, but it is a good approximation). You can think of this initial program as being the kernel (but it is normally a bootloader that then calls the kernel, still in ring 0).
- when a userland process wants the kernel to do something for it, like write to a file, it uses an instruction that generates an interrupt, such as int 0x80 (or the dedicated syscall instruction on x86-64), to signal the kernel.
  When this happens, the CPU calls an interrupt handler which the kernel registered at boot time.
  This handler runs in ring 0. It decides whether the kernel will allow the action, does the action, and restarts the userland program in ring 3.
- when the exec system call is used (or when the kernel starts /init), the kernel prepares the registers and memory of the new userland process, then jumps to the entry point and switches the CPU to ring 3.
- if the program tries to do something naughty, like write to a forbidden register or memory address (because of paging), the CPU also calls a kernel handler in ring 0.
  But since the userland was naughty, the kernel might kill the process this time, or give it a warning with a signal.
- when the kernel boots, it sets up a hardware clock with some fixed frequency, which generates interrupts periodically.
  The handlers for these interrupts run in ring 0, which lets the kernel schedule which userland processes to wake up.
  This way, scheduling can happen even if the processes are not making any system calls.
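The transitions above can be mimicked with a toy model. Nothing here touches real hardware: ToyCPU, boot, sys_write, polite_program and naughty_program are hypothetical names, and the model only tracks a ring number and a handler table standing in for the interrupt table. It is meant as a rough sketch of the control flow, under those assumptions.

```python
SYSCALL = 0x80          # vector used for system calls in this toy model
PROTECTION_FAULT = 13   # vector used for privilege violations in this toy model


class ProcessKilled(Exception):
    """Raised by the toy kernel when it terminates a misbehaving process."""


class ToyCPU:
    def __init__(self):
        self.ring = 0        # power-on: start in the most privileged ring
        self.handlers = {}   # vector -> handler, a stand-in for the interrupt table
        self.files = {}      # pretend storage that only the kernel writes to

    def register_handler(self, vector, handler):
        if self.ring != 0:   # userland may not install handlers
            return self.interrupt(PROTECTION_FAULT, "register_handler")
        self.handlers[vector] = handler

    def interrupt(self, vector, *args):
        saved_ring = self.ring
        self.ring = 0                  # the handler always runs in ring 0
        try:
            return self.handlers[vector](self, *args)
        finally:
            self.ring = saved_ring     # go back to the interrupted ring

    def run_userland(self, program):
        self.ring = 3                  # like exec: jump to the program in ring 3
        try:
            return program(self)
        finally:
            self.ring = 0              # control is back in the kernel


def sys_write(cpu, path, data):
    cpu.files[path] = data             # the kernel does the work for the caller
    return len(data)


def on_fault(cpu, what):
    raise ProcessKilled(what)          # the kernel decides to kill the process


def boot():
    cpu = ToyCPU()
    cpu.register_handler(SYSCALL, sys_write)
    cpu.register_handler(PROTECTION_FAULT, on_fault)
    return cpu


def polite_program(cpu):
    return cpu.interrupt(SYSCALL, "/tmp/demo", b"hello")


def naughty_program(cpu):
    cpu.register_handler(SYSCALL, lambda cpu, *args: "pwned")


if __name__ == "__main__":
    cpu = boot()
    print("bytes written:", cpu.run_userland(polite_program))
    try:
        cpu.run_userland(naughty_program)
    except ProcessKilled as reason:
        print("process killed for:", reason)
```

The polite program gets its write done only by asking through the system call vector, while the naughty one is stopped by the fault handler and leaves the kernel's handler table untouched.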
What is the point of having multiple rings?
There are two major advantages of separating kernel and userland:
- it is easier to make programs as you are more certain one won't interfere with the other. E.g., one userland process does not have to worry about overwriting the memory of another program because of paging, nor about putting hardware in an invalid state for another process.
- it is more secure. E.g. file permissions and memory separation could prevent a hacking app from reading your bank data. This supposes, of course, that you trust the kernel.
How to play around with it?
I've created a bare metal setup that should be a good way to manipulate rings directly: https://github.com/cirosantilli/x86-bare-metal-examples
I didn't have the patience to make a userland example unfortunately, but I did go as far as paging setup, so userland should be feasible. I'd love to see a pull request.
Alternatively, Linux kernel modules run in ring 0, so you can use them to try out privileged operations, e.g. read the control registers: How to access the control registers cr0,cr2,cr3 from a program? Getting segmentation fault
Here is a convenient QEMU + Buildroot setup to try it out without killing your host.
The downside of kernel modules is that other kthreads are running and could interfere with your experiments. But in theory you can take over all interrupt handlers with your kernel module and own the system, that would be an interesting project actually.
Negative rings
While negative rings are not referenced in the Intel manual, there are CPU modes which have more capabilities than ring 0 itself, and so are a good fit for the "negative ring" name.
One example is the hypervisor mode used in virtualization.
For further details see: https://security.stackexchange.com/questions/129098/what-is-protection-ring-1
ARM
In ARM, the rings are called Exception Levels instead, but the main ideas remain the same.
There exist 4 exception levels in ARMv8, commonly used as:
EL0: userland
EL1: kernel
EL2: hypervisors, for example Xen.
A hypervisor is to an OS, what an OS is to userland.
For example, Xen allows you to run multiple OSes such as Linux or Windows on the same system at the same time, and it isolates the OSes from one another for security and ease of debug, just like Linux does for userland programs.
Hypervisors are a key part of today's cloud infrastructure: they allow multiple servers to run on a single hardware, keeping hardware usage always close to 100% and saving a lot of money.
AWS for example used Xen until 2017 when its move to KVM made the news.
EL3: yet another level. TODO example.
The ARMv8 Architecture Reference Manual DDI 0487C.a - Chapter D1 - The AArch64 System Level Programmer's Model - Figure D1-1 illustrates this beautifully.
Note how ARM, maybe due to the benefit of hindsight, has a better naming convention for the privilege levels than x86, without the need for negative levels: 0 is the lowest and 3 the highest. More privileged levels tend to be added over time more often than less privileged ones.
The current EL can be queried with the MRS instruction: what is the current execution mode/exception level, etc?
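On AArch64 the value read this way comes from the CurrentEL system register, which keeps the exception level in bits 3..2. As a small sketch, decoding a raw register value looks like this (the function name current_el is made up, and the input values are just the four possible encodings):

```python
def current_el(currentel_value):
    """Extract the exception level from a raw CurrentEL register value."""
    return (currentel_value >> 2) & 0b11


if __name__ == "__main__":
    for raw in (0x0, 0x4, 0x8, 0xC):
        print(hex(raw), "-> EL" + str(current_el(raw)))
```

Bare metal code typically performs the same shift and mask in assembly right after the MRS read.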
ARM does not require all exception levels to be present to allow for implementations that don't need the feature to save chip area. ARMv8 "Exception levels" says:
An implementation might not include all of the Exception levels. All implementations must include EL0 and EL1.
EL2 and EL3 are optional.
QEMU for example defaults to EL1, but EL2 and EL3 can be enabled with command line options: qemu-system-aarch64 entering el1 when emulating a53 power up
1
Since this question isn't specific to any OS, in and out are available to ring 3. The TSS can point to an IO permission table in the current task granting read/write access to all or specific ports.
– Michael Petch
Feb 21 at 2:37
Of course, if you set the IOPL bits to the value 3, then the ring 3 program has full port access and the TSS IO permissions don't apply.
– Michael Petch
Feb 21 at 3:44
@MichaelPetch thanks, I didn't know this. I have updated the answer.
– Ciro Santilli 新疆改造中心 六四事件 法轮功
Feb 21 at 13:33
add a comment |
up vote
2
down vote
CPU rings are the most clear distinction
In x86 protected mode, the CPU is always in one of 4 rings. The Linux kernel only uses 0 and 3:
- 0 for kernel
- 3 for users
This is the most hard and fast definition of kernel vs userland.
Why Linux does not use rings 1 and 2: CPU Privilege Rings: Why rings 1 and 2 aren't used?
How is the current ring determined?
The current ring is selected by a combination of:
global descriptor table: a in-memory table of GDT entries, and each entry has a field
Privl
which encodes the ring.
The LGDT instruction sets the address to the current descriptor table.
See also: http://wiki.osdev.org/Global_Descriptor_Table
the segment registers CS, DS, etc., which point to the index of an entry in the GDT.
For example,
CS = 0
means the first entry of the GDT is currently active for the executing code.
What can each ring do?
The CPU chip is physically built so that:
ring 0 can do anything
ring 3 cannot run several instructions and write to several registers, most notably:
cannot change its own ring! Otherwise, it could set itself to ring 0 and rings would be useless.
In other words, cannot modify the current segment descriptor, which determines the current ring.
cannot modify the page tables: How does x86 paging work?
In other words, cannot modify the CR3 register, and paging itself prevents modification of the page tables.
This prevents one process from seeing the memory of other processes for security / ease of programming reasons.
cannot register interrupt handlers. Those are configured by writing to memory locations, which is also prevented by paging.
Handlers run in ring 0, and would break the security model.
In other words, cannot use the LGDT and LIDT instructions.
cannot do IO instructions like
in
andout
, and thus have arbitrary hardware accesses.
Otherwise, for example, file permissions would be useless if any program could directly read from disk.
More precisely thanks to Michael Petch: it is actually possible for the OS to allow IO instructions on ring 3, this is actually controlled by the Task state segment.
What is not possible is for ring 3 to give itself permission to do so if it didn't have it in the first place.
Linux always disallows it. See also: Why doesn't Linux use the hardware context switch via the TSS?
How do how programs and operating systems transition between rings?
when the CPU is turned on, it starts running the initial program in ring 0 (well kind of, but it is a good approximation). You can think this initial program as being the kernel (but it is normally a bootloader that then calls the kernel still in ring 0).
when an userland process wants the kernel to do something for it like write to a file, it uses an instruction that generates an interrupt such as
int 0x80
to signal the kernel.
When this happens, the CPU calls and interrupt callback handler which the kernel registered at boot time.
This handler runs in ring 0, which decides if the kernel will allow this action, do the action, and restart the userland program in ring 3.
when the
exec
system call is used (or when the kernel will start/init
), the kernel prepares the registers and memory of the new userland process, then it jumps to the entry point and switches the CPU to ring 3
If the program tries to do something naughty like write to a forbidden register or memory address (because of paging), the CPU also calls some kernel callback handler in ring 0.
But since the userland was naughty, the kernel might kill the process this time, or give it a warning with a signal.
When the kernel boots, it setups a hardware clock with some fixed frequency, which generates interrupts periodically.
This hardware clock generates interrupts that run ring 0, and allow it to schedule which userland processes to wake up.
This way, scheduling can happen even if the processes are not making any system calls.
What is the point of having multiple rings?
There are two major advantages of separating kernel and userland:
- it is easier to make programs as you are more certain one won't interfere with the other. E.g., one userland process does not have to worry about overwriting the memory of another program because of paging, nor about putting hardware in an invalid state for another process.
- it is more secure. E.g. file permissions and memory separation could prevent a hacking app from reading your bank data. This supposes, of course, that you trust the kernel.
How to play around with it?
I've created a bare metal setup that should be a good way to manipulate rings directly: https://github.com/cirosantilli/x86-bare-metal-examples
I didn't have the patience to make a userland example unfortunately, but I did go as far as paging setup, so userland should be feasible. I'd love to see a pull request.
Alternatively, Linux kernel modules run in ring 0, so you can use them to try out privileged operations, e.g. read the control registers: How to access the control registers cr0,cr2,cr3 from a program? Getting segmentation fault
Here is a convenient QEMU + Buildroot setup to try it out without killing your host.
The downside of kernel modules is that other kthreads are running and could interfere with your experiments. But in theory you can take over all interrupt handlers with your kernel module and own the system, that would be an interesting project actually.
Negative rings
While negative rings are not actually referenced in the Intel manual, there are actually CPU modes which have further capabilities than ring 0 itself, and so are a good fit for the "negative ring" name.
One example is the hypervisor mode used in virtualization.
For further details see: https://security.stackexchange.com/questions/129098/what-is-protection-ring-1
ARM
In ARM, the rings are called Exception Levels instead, but the main ideas remain the same.
There exist 4 exception levels in ARMv8, commonly used as:
EL0: userland
EL1: kernel
EL2: hypervisors, for example Xen.
A hypervisor is to an OS, what an OS is to userland.
For example, Xen allows you to run multiple OSes such as Linux or Windows on the same system at the same time, and it isolates the OSes from one another for security and ease of debug, just like Linux does for userland programs.
Hypervisors are a key part of today's cloud infrastructure: they allow multiple servers to run on a single hardware, keeping hardware usage always close to 100% and saving a lot of money.
AWS for example used Xen until 2017 when its move to KVM made the news.
EL3: yet another level. TODO example.
The ARMv8 Architecture Reference Model DDI 0487C.a - Chapter D1 - The AArch64 System Level Programmer's Model - Figure D1-1 illustrates this beautifully:
Note how ARM, maybe due to the benefit of hindsight, has a better naming convention for the privilege levels than x86, without the need for negative levels: 0 being the lower and 3 highest. Higher levels tend to be created more often than lower ones.
The current EL can be queried with the MRS
instruction: what is the current execution mode/exception level, etc?
ARM does not require all exception levels to be present to allow for implementations that don't need the feature to save chip area. ARMv8 "Exception levels" says:
An implementation might not include all of the Exception levels. All implementations must include EL0 and EL1.
EL2 and EL3 are optional.
QEMU for example defaults to EL1, but EL2 and EL3 can be enabled with command line options: qemu-system-aarch64 entering el1 when emulating a53 power up
1
Since this question isn't specific to any OS,in
andout
are available to ring 3. The TSS can point to an IO permission table in the current task granting read/write access to all or specific ports.
– Michael Petch
Feb 21 at 2:37
Of course you set the IOPL bits to the value 3 then the ring 3 program has full port access and the TSS IO permissions don't apply.
– Michael Petch
Feb 21 at 3:44
@MichaelPetch thanks, I didn't know this. I have updated the answer.
– Ciro Santilli 新疆改造中心 六四事件 法轮功
Feb 21 at 13:33
add a comment |
up vote
2
down vote
up vote
2
down vote
CPU rings are the most clear distinction
In x86 protected mode, the CPU is always in one of 4 rings. The Linux kernel only uses 0 and 3:
- 0 for kernel
- 3 for users
This is the most hard and fast definition of kernel vs userland.
Why Linux does not use rings 1 and 2: CPU Privilege Rings: Why rings 1 and 2 aren't used?
How is the current ring determined?
The current ring is selected by a combination of:
global descriptor table: a in-memory table of GDT entries, and each entry has a field
Privl
which encodes the ring.
The LGDT instruction sets the address to the current descriptor table.
See also: http://wiki.osdev.org/Global_Descriptor_Table
the segment registers CS, DS, etc., which point to the index of an entry in the GDT.
For example,
CS = 0
means the first entry of the GDT is currently active for the executing code.
What can each ring do?
The CPU chip is physically built so that:
ring 0 can do anything
ring 3 cannot run several instructions and write to several registers, most notably:
cannot change its own ring! Otherwise, it could set itself to ring 0 and rings would be useless.
In other words, cannot modify the current segment descriptor, which determines the current ring.
cannot modify the page tables: How does x86 paging work?
In other words, cannot modify the CR3 register, and paging itself prevents modification of the page tables.
This prevents one process from seeing the memory of other processes for security / ease of programming reasons.
cannot register interrupt handlers. Those are configured by writing to memory locations, which is also prevented by paging.
Handlers run in ring 0, and would break the security model.
In other words, cannot use the LGDT and LIDT instructions.
cannot do IO instructions like
in
andout
, and thus have arbitrary hardware accesses.
Otherwise, for example, file permissions would be useless if any program could directly read from disk.
More precisely thanks to Michael Petch: it is actually possible for the OS to allow IO instructions on ring 3, this is actually controlled by the Task state segment.
What is not possible is for ring 3 to give itself permission to do so if it didn't have it in the first place.
Linux always disallows it. See also: Why doesn't Linux use the hardware context switch via the TSS?
How do how programs and operating systems transition between rings?
- when the CPU is turned on, it starts running the initial program in ring 0 (well kind of, but it is a good approximation). You can think of this initial program as being the kernel (but it is normally a bootloader that then calls the kernel, still in ring 0).
- when a userland process wants the kernel to do something for it, like write to a file, it uses an instruction that generates an interrupt, such as int 0x80, to signal the kernel.
  When this happens, the CPU calls an interrupt callback handler which the kernel registered at boot time.
  This handler runs in ring 0. It decides whether the kernel will allow the action, does the action, and restarts the userland program in ring 3.
- when the exec system call is used (or when the kernel starts /init), the kernel prepares the registers and memory of the new userland process, then it jumps to the entry point and switches the CPU to ring 3.
- if the program tries to do something naughty like write to a forbidden register or memory address (because of paging), the CPU also calls some kernel callback handler in ring 0.
  But since the userland was naughty, the kernel might kill the process this time, or give it a warning with a signal.
- when the kernel boots, it sets up a hardware clock with some fixed frequency, which generates interrupts periodically.
  The handlers for these interrupts run in ring 0, and allow the kernel to schedule which userland processes to wake up.
  This way, scheduling can happen even if the processes are not making any system calls.
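The transitions listed above can be mimicked with a toy model. This is a deliberately simplified sketch, not how any real kernel is written: ToyCPU, sys_write, boot and the use of PermissionError to stand in for a hardware fault are all made up for illustration. It shows the three key points: handlers are registered while in ring 0, an interrupt temporarily raises the privilege to ring 0, and control returns to ring 3 afterwards.

```python
SYSCALL = 0x80  # interrupt number used in the text's int 0x80 example


class ToyCPU:
    """Tiny model: a current ring plus the handlers registered at boot."""

    def __init__(self):
        self.ring = 0       # power-on: execution starts in ring 0
        self.handlers = {}  # interrupt number -> kernel handler
        self.log = []       # what the "kernel" did, and in which ring

    def register_handler(self, number, handler):
        # On real hardware this would fault; the toy raises an exception.
        if self.ring != 0:
            raise PermissionError("only ring 0 may register handlers")
        self.handlers[number] = handler

    def enter_userland(self):
        self.ring = 3

    def interrupt(self, number, *args):
        """Switch to ring 0, run the kernel handler, restore the old ring."""
        previous = self.ring
        self.ring = 0
        try:
            return self.handlers[number](self, *args)
        finally:
            self.ring = previous


def sys_write(cpu, text):
    """Kernel-side handler: runs in ring 0 on behalf of the user program."""
    cpu.log.append((cpu.ring, text))
    return len(text)


def boot():
    cpu = ToyCPU()
    cpu.register_handler(SYSCALL, sys_write)  # done at boot, in ring 0
    cpu.enter_userland()                      # like starting /init
    return cpu


if __name__ == "__main__":
    cpu = boot()
    written = cpu.interrupt(SYSCALL, "hello")  # the userland "int 0x80"
    print(written, cpu.log, cpu.ring)
```

A timer interrupt would follow the same path through interrupt(), just triggered by the clock instead of by the program.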
What is the point of having multiple rings?
There are two major advantages of separating kernel and userland:
- it is easier to make programs as you are more certain one won't interfere with the other. E.g., one userland process does not have to worry about overwriting the memory of another program because of paging, nor about putting hardware in an invalid state for another process.
- it is more secure. E.g. file permissions and memory separation could prevent a hacking app from reading your bank data. This supposes, of course, that you trust the kernel.
How to play around with it?
I've created a bare metal setup that should be a good way to manipulate rings directly: https://github.com/cirosantilli/x86-bare-metal-examples
I didn't have the patience to make a userland example unfortunately, but I did go as far as paging setup, so userland should be feasible. I'd love to see a pull request.
Alternatively, Linux kernel modules run in ring 0, so you can use them to try out privileged operations, e.g. read the control registers: How to access the control registers cr0,cr2,cr3 from a program? Getting segmentation fault
Here is a convenient QEMU + Buildroot setup to try it out without killing your host.
The downside of kernel modules is that other kthreads are running and could interfere with your experiments. But in theory you can take over all interrupt handlers with your kernel module and own the system, which would actually be an interesting project.
Negative rings
While negative rings are not referenced in the Intel manual, there are CPU modes which have more capabilities than ring 0 itself, and so are a good fit for the "negative ring" name.
One example is the hypervisor mode used in virtualization.
For further details see: https://security.stackexchange.com/questions/129098/what-is-protection-ring-1
ARM
In ARM, the rings are called Exception Levels instead, but the main ideas remain the same.
There exist 4 exception levels in ARMv8, commonly used as:
EL0: userland
EL1: kernel
EL2: hypervisors, for example Xen.
A hypervisor is to an OS, what an OS is to userland.
For example, Xen allows you to run multiple OSes such as Linux or Windows on the same system at the same time, and it isolates the OSes from one another for security and ease of debug, just like Linux does for userland programs.
Hypervisors are a key part of today's cloud infrastructure: they allow multiple servers to run on a single piece of hardware, keeping hardware utilization high and saving a lot of money.
AWS for example used Xen until 2017 when its move to KVM made the news.
EL3: yet another level, typically used by secure monitor firmware such as ARM Trusted Firmware.
The ARMv8 Architecture Reference Manual DDI 0487C.a - Chapter D1 - The AArch64 System Level Programmer's Model - Figure D1-1 illustrates this beautifully.
Note how ARM, maybe due to the benefit of hindsight, has a better naming convention for the privilege levels than x86, without the need for negative levels: 0 is the lowest and 3 the highest. More privileged levels tend to be added over time more often than less privileged ones, so numbering upwards leaves room for them.
The current EL can be queried with the MRS instruction: what is the current execution mode/exception level, etc?
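As a small illustration of what that query returns: in AArch64, reading the CurrentEL system register with MRS yields a value whose bits [3:2] hold the exception level. The sketch below only decodes such a value; the function names and the role table (taken from the list above, with EL3 left generic) are illustrative assumptions, and the actual MRS read has to happen in assembly on real hardware or in an emulator.

```python
# Roles as listed in the answer; anything else is reported generically.
ROLES = {0: "userland", 1: "kernel", 2: "hypervisor"}


def decode_current_el(register_value: int) -> int:
    """Return the exception level held in bits [3:2] of a CurrentEL value."""
    return (register_value >> 2) & 0b11


def describe_current_el(register_value: int) -> str:
    el = decode_current_el(register_value)
    return f"EL{el} ({ROLES.get(el, 'higher level')})"


if __name__ == "__main__":
    # Sample raw values as they could appear in a register after the read.
    for raw in (0x0, 0x4, 0x8, 0xC):
        print(hex(raw), describe_current_el(raw))
```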
ARM does not require all exception levels to be present, so that implementations that don't need a feature can save chip area. ARMv8 "Exception levels" says:
An implementation might not include all of the Exception levels. All implementations must include EL0 and EL1.
EL2 and EL3 are optional.
QEMU for example defaults to EL1, but EL2 and EL3 can be enabled with command line options: qemu-system-aarch64 entering el1 when emulating a53 power up
edited yesterday
answered Feb 16 at 15:17
Ciro Santilli 新疆改造中心 六四事件 法轮功
130k27510441
1
Since this question isn't specific to any OS, in and out are available to ring 3. The TSS can point to an IO permission table in the current task granting read/write access to all or specific ports.
– Michael Petch
Feb 21 at 2:37
Of course, if you set the IOPL bits to the value 3, then the ring 3 program has full port access and the TSS IO permissions don't apply.
– Michael Petch
Feb 21 at 3:44
@MichaelPetch thanks, I didn't know this. I have updated the answer.
– Ciro Santilli 新疆改造中心 六四事件 法轮功
Feb 21 at 13:33
up vote
1
down vote
What
Basically, the difference between kernel and user modes is not OS dependent: it is achieved purely in hardware, by restricting some instructions so that they can only run in kernel mode. All other goals, such as memory protection, are built on top of that restriction.
How
The processor is always in either kernel mode or user mode. The architecture provides mechanisms which guarantee that whenever the processor switches to kernel mode, it is OS code that gets fetched and run.
Why
With this hardware infrastructure, common OSes can achieve the following:
- preventing user programs from accessing the whole memory, so that they cannot overwrite the OS, for example,
- preventing user programs from performing sensitive instructions, such as those that change the CPU memory pointer bounds, so that they cannot break out of their memory bounds, for example.
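The two points above can be sketched with a toy base-and-limit model. Everything here is hypothetical and simplified (the class BoundedCPU, the ProtectionFault exception, and switching modes by setting an attribute); it only illustrates that bounds are checked on user-mode accesses and that changing the bounds is itself a privileged operation.

```python
class ProtectionFault(Exception):
    """Raised where real hardware would trap into the kernel."""


class BoundedCPU:
    def __init__(self, memory_size: int):
        self.memory = bytearray(memory_size)
        self.kernel_mode = True  # the OS runs first
        self.base = 0
        self.limit = memory_size

    def set_bounds(self, base: int, limit: int) -> None:
        """Privileged instruction: only kernel mode may move the bounds."""
        if not self.kernel_mode:
            raise ProtectionFault("set_bounds attempted in user mode")
        self.base, self.limit = base, limit

    def store(self, address: int, value: int) -> None:
        """User-mode stores are checked against the current bounds."""
        in_bounds = self.base <= address < self.base + self.limit
        if not self.kernel_mode and not in_bounds:
            raise ProtectionFault(f"store to {address} outside bounds")
        self.memory[address] = value


if __name__ == "__main__":
    cpu = BoundedCPU(64)
    cpu.set_bounds(32, 16)   # the OS gives the program bytes 32..47
    cpu.kernel_mode = False  # simplified switch to user mode
    cpu.store(40, 7)         # inside the bounds: allowed
    for attempt in (lambda: cpu.store(0, 1), lambda: cpu.set_bounds(0, 64)):
        try:
            attempt()
        except ProtectionFault as fault:
            print("fault:", fault)
```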
answered Oct 26 '17 at 20:45
Ali Asgari
949