DragonFly BSD
DragonFly kernel List (threaded) for 2013-04

[GSOC] Implement hardware nested page table support for vkernels


From: Mihai Carabas <mihai.carabas@xxxxxxxxx>
Date: Mon, 22 Apr 2013 00:42:24 +0300

Hello,

My name is Mihai Carabas and I am a second-year master's student at the
Politehnica University of Bucharest, Romania, in the Computer Science and
Engineering Department.

I was involved last year in the GSoC program with the DragonFly BSD
scheduler ("SMT/HT awareness to DragonFlyBSD scheduler"). For those who
aren't familiar with what I accomplished last year, here [1] is a summary
of the work and results. Meanwhile Matthew refactored the scheduler and,
together with other improvements, the results got much better.

You can find more about me in last year's proposal [2]. For the past half
year I have worked on virtualizing Android on top of the L4 microkernel.
The goal is to have two Androids running on a Galaxy Nexus. My role in this
project was to port (virtualize) a flavour of the Linux kernel 3.0.8 from
Samsung (tuna/maguro) on top of the microkernel. Most problems came from
the fact that there were three layers of addresses (physical addresses used
by the microkernel, microkernel virtual addresses used by the Linux kernel,
and Linux virtual addresses used by the user-space Linux programs). The
Linux kernel was running in a virtual space (the microkernel address
space), and one type of problem arose when allocating "physical" memory for
devices (e.g. the GPU) and passing the device the address to read from or
write to. We had to take care to pass the actual physical address, not the
virtual one; otherwise we would get an interconnect error, which is hard to
trace.

After browsing through all of this year's projects, and because I have been
working a lot with memory mapping and address translation, I would like to
work this year on the "Implement hardware nested page table support for
vkernels" project.

Before I begin I must get a strong understanding of how the current virtual
page table is implemented, starting from the vmspace implementation [3].
Another place to see how the vmspace works is the "Page Faults" section of
the "Virtual Kernel Peek" [4]. A further one is Matthew Dillon's article
[6]. It's worth mentioning that last year I worked a little with the
vkernels: I implemented the CPU topology for them (basically you can create
any kind of topology you want).

I have started reading up on how nested page tables work. It's worth
mentioning that both AMD and Intel have implemented this virtualization
extension, but under different names: NPT (nested page tables) for AMD and
EPT (extended page tables) for Intel. As far as I have read, they differ in
some important details (for example, EPT as originally introduced doesn't
support accessed/dirty bits; I have to see how this would influence the
implementation).

A brief description of how NPT works can be found in the System Programming
manual from AMD [5] at page 491. Basically, instead of using only the CR3
register, which indicates where the page tables are, there are two
registers: gCR3 (guest CR3), which points to the guest page tables (mapping
guest virtual pages to guest physical pages), and nCR3 (nested CR3), which
points to the host page tables (mapping guest physical pages to physical
memory). The TLB caches the direct mappings (guest virtual pages to
physical memory pages), and entries belonging to different guests are
distinguished by an Address Space ID (ASID). A more extended description
can be found in this paper published by AMD [7].
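
To make the two-stage translation concrete, here is a toy sketch in C. It
is only an illustration under loud assumptions: the tables are single-level
arrays with made-up contents, and the names (guest_pt, nested_pt,
nested_translate, nested_walk_refs) are hypothetical, not taken from the
AMD manual or from DragonFly. Real hardware walks multi-level tables in
both dimensions, which is why caching the direct mapping in the TLB matters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12u
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1u)
#define NPAGES     8u            /* toy address space of 8 pages */
#define INVALID    UINT32_MAX    /* marks an unmapped page or a fault */

/* Toy table reached through gCR3: guest virtual page -> guest physical page.
 * The contents are invented for the example. */
static const uint32_t guest_pt[NPAGES] = {
    3, 1, INVALID, 0, 2, INVALID, INVALID, INVALID
};

/* Toy table reached through nCR3: guest physical page -> host physical page. */
static const uint32_t nested_pt[NPAGES] = {
    5, 7, INVALID, 2, INVALID, INVALID, INVALID, INVALID
};

/* Look up one page number in a toy table; out-of-range pages are unmapped. */
static uint32_t lookup(const uint32_t *pt, uint32_t page)
{
    assert(pt != NULL);
    return page < NPAGES ? pt[page] : INVALID;
}

/* Translate a guest virtual address to a host physical address.
 * Returns INVALID on either a guest fault or a nested fault. */
uint32_t nested_translate(uint32_t gva)
{
    uint32_t gpn = lookup(guest_pt, gva >> PAGE_SHIFT);
    if (gpn == INVALID)
        return INVALID;          /* guest page fault (gCR3 side) */

    uint32_t hpn = lookup(nested_pt, gpn);
    if (hpn == INVALID)
        return INVALID;          /* nested page fault (nCR3 side) */

    return (hpn << PAGE_SHIFT) | (gva & PAGE_MASK);
}

/* Worst-case memory references for one nested walk: every guest table
 * pointer is itself a guest physical address that needs a nested walk,
 * giving (g + 1) * (n + 1) - 1 references for g guest and n nested levels. */
unsigned nested_walk_refs(unsigned guest_levels, unsigned nested_levels)
{
    return (guest_levels + 1u) * (nested_levels + 1u) - 1u;
}
```

With these toy tables, nested_translate(0x0123) goes from guest virtual
page 0 to guest physical page 3 and then to host page 2, giving 0x2123.
With four levels in each dimension, nested_walk_refs(4, 4) is 24.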

My plan is as follows:

1) Create a mechanism to detect which virtualization extension the CPU
supports. For this I can use the cpuid instruction. For example, on AMD:
check CPUID Fn8000_0001_ECX[SVM] to see if the CPU supports virtualization
at all and, if so, check CPUID Fn8000_000A_EDX[NP] to see if the NPT
extension is available [8].

2) While exposing the information discovered in 1), document the flow of
calls for virtual memory allocation/creation when a vkernel starts (and,
further, when a process in the vkernel is created).

3) Document how the virtual page tables are walked and propose a design
for a hardware implementation using NPT/EPT.

4) Pick a platform (probably an Intel Core i3) and write a stub
implementation for activating/using the virtualization extension (this can
be done by looking at the normal implementation with one CR3 register).
Here we have to take care to keep the current implementation as the
fallback if no NPT/EPT is present.

5) After having a stable vkernel with NPT/EPT, begin testing to see what
the gain is with the virtualization extension enabled. Here we must pick
programs that allocate/free and access a lot of pages, so that mappings
are invalidated and new ones are created. This way we force page table
walks.

6) Extend the implementation to all platforms (32/64-bit, AMD/Intel).
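
The detection in step 1 could look roughly like the C sketch below. It is
a minimal userland illustration, not kernel code: it assumes GCC or Clang
(for <cpuid.h> and __get_cpuid), and the function and enum names are
hypothetical. The AMD bits are the ones named above (SVM is ECX bit 2 of
Fn8000_0001, NP is EDX bit 0 of Fn8000_000A); for Intel it only checks the
VMX bit (CPUID.1:ECX bit 5), because EPT itself is reported through the VMX
capability MSRs, which can only be read from ring 0.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>               /* GCC/Clang helper: __get_cpuid() */
#endif

#define CPUID_8000_0001_ECX_SVM (1u << 2)  /* AMD: SVM supported */
#define CPUID_8000_000A_EDX_NP  (1u << 0)  /* AMD: nested paging supported */
#define CPUID_1_ECX_VMX         (1u << 5)  /* Intel: VMX supported */

/* Hypothetical result type for this sketch. */
enum virt_ext { VIRT_NONE, VIRT_AMD_SVM, VIRT_AMD_SVM_NPT, VIRT_INTEL_VMX };

/* Pure bit tests on raw register values, so they can be checked anywhere. */
bool svm_present(uint32_t ecx_8000_0001)
{
    return (ecx_8000_0001 & CPUID_8000_0001_ECX_SVM) != 0;
}

bool npt_present(uint32_t ecx_8000_0001, uint32_t edx_8000_000a)
{
    /* NP is only meaningful when SVM itself is reported. */
    return svm_present(ecx_8000_0001) &&
           (edx_8000_000a & CPUID_8000_000A_EDX_NP) != 0;
}

bool vmx_present(uint32_t ecx_1)
{
    return (ecx_1 & CPUID_1_ECX_VMX) != 0;
}

/* Query the running CPU. Assumption: the SVM bit reads as zero on
 * non-AMD parts, so no vendor-string check is done here. */
enum virt_ext detect_virt_ext(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid() returns 0 when the requested leaf is not available. */
    if (__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx) && svm_present(ecx)) {
        uint32_t svm_ecx = ecx;
        if (__get_cpuid(0x8000000Au, &eax, &ebx, &ecx, &edx) &&
            npt_present(svm_ecx, edx))
            return VIRT_AMD_SVM_NPT;
        return VIRT_AMD_SVM;
    }
    if (__get_cpuid(1u, &eax, &ebx, &ecx, &edx) && vmx_present(ecx))
        return VIRT_INTEL_VMX;   /* EPT still has to be checked in ring 0 */
#endif
    return VIRT_NONE;
}
```

A caller would switch on detect_virt_ext() and keep the existing software
page table path for VIRT_NONE, which matches the fallback in step 4.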

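For the testing in step 5, a workload of the kind I have in mind could be
as simple as the sketch below. It is an assumption about what a useful
microbenchmark might look like, not an existing test: churn_pages is a
hypothetical name, and timing (for example with clock_gettime around the
call) is left to the caller. It maps anonymous memory, touches one byte in
every page, and unmaps it again, so each round creates fresh mappings and
then invalidates them.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1        /* ask glibc to expose MAP_ANON */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#if !defined(MAP_ANON) && defined(MAP_ANONYMOUS)
#define MAP_ANON MAP_ANONYMOUS
#endif

/* Map npages anonymous pages, write one byte to each, unmap, and repeat
 * for the given number of rounds. Returns the number of pages touched
 * (npages * rounds on success, fewer if mmap() fails part way). */
size_t churn_pages(size_t npages, unsigned rounds)
{
    long psz = sysconf(_SC_PAGESIZE);
    size_t touched = 0;

    if (psz <= 0 || npages == 0)
        return 0;

    size_t len = npages * (size_t)psz;

    for (unsigned r = 0; r < rounds; r++) {
        void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_ANON | MAP_PRIVATE, -1, 0);
        if (mem == MAP_FAILED)
            return touched;

        /* volatile keeps the compiler from dropping the stores */
        volatile unsigned char *p = mem;
        for (size_t i = 0; i < npages; i++) {
            p[i * (size_t)psz] = (unsigned char)r;  /* fault the page in */
            touched++;
        }

        munmap(mem, len);        /* tear the mappings down again */
    }
    return touched;
}
```

Running the same call, for example churn_pages(16, 4), inside a vkernel
with and without NPT/EPT and comparing the elapsed time would give a first
rough number for the gain.
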
I would be glad to know your opinion on the above.

Best regards,
Mihai

[1] http://lists.dragonflybsd.org/pipermail/kernel/2012-August/015478.html
[2] http://leaf.dragonflybsd.org/mailarchive/kernel/2012-03/msg00066.html
[3] http://gitweb.dragonflybsd.org/dragonfly.git/blob/af57bdcfa1fd2e47461fc0f5be53cb2626b0a41a:/sys/vm/vm_vmspace.c
[4] http://www.dragonflybsd.org/docs/developer/VirtualKernelPeek/
[5] http://support.amd.com/us/Embedded_TechDocs/24593.pdf
[6] http://www.freebsd.org/doc/en/articles/vm-design/
[7] http://developer.amd.com/wordpress/media/2012/10/NPT-WP-1%201-final-TM.pdf
[8] http://support.amd.com/us/Embedded_TechDocs/25481.pdf
