DragonFly BSD

CVS log for src/sys/i386/i386/Attic/bcopy.s




Keyword substitution: kv
Default branch: MAIN


Revision 1.9
Sun Oct 22 16:09:21 2006 UTC (7 years, 9 months ago) by dillon
Branches: MAIN
CVS tags: HEAD
FILE REMOVED
Changes since revision 1.8: +1 -1 lines
Reorganize the way machine architectures are handled.  Consolidate the
kernel configurations into a single generic directory.  Move machine-specific
Makefiles and loader scripts into the appropriate architecture directory.

Kernel and module builds also generally add sys/arch to the include path so
source files that include architecture-specific headers do not have to
be adjusted.

sys/<ARCH>		-> sys/arch/<ARCH>
sys/conf/*.<ARCH>	-> sys/arch/<ARCH>/conf/*.<ARCH>
sys/<ARCH>/conf/<KERNEL> -> sys/config/<KERNEL>

Revision 1.8
Sat Jun 10 18:07:05 2006 UTC (8 years, 1 month ago) by dillon
Branches: MAIN
CVS tags: DragonFly_RELEASE_1_6_Slip, DragonFly_RELEASE_1_6
Changes since revision 1.7: +8 -4 lines
We shouldn't have to fninit to make the FP unit usable for MMX based copies.
fnclex should be sufficient.

Reported-by: "Attilio Rao" <attilio@freebsd.org>
Info-originally-from: Bruce Evans

Revision 1.7
Mon Jun 20 17:43:37 2005 UTC (9 years, 1 month ago) by dillon
Branches: MAIN
CVS tags: DragonFly_RELEASE_1_4_Slip, DragonFly_RELEASE_1_4
Changes since revision 1.6: +19 -0 lines
Introduce an ultra-simple, non-overlapping, int-aligned bcopy called bcopyi().
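
As a rough illustration of that contract (not the committed assembly), a C
equivalent might look like the sketch below.  The name bcopyi_sketch, the use
of uint32_t for the i386 int, and the requirement that the length be a whole
number of ints are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical C rendering of an int-aligned, non-overlapping copy.
 * Argument order follows bcopy(): source first, then destination.
 */
static void
bcopyi_sketch(const void *src, void *dst, size_t len)
{
	const uint32_t *s = src;
	uint32_t *d = dst;
	size_t n = len / sizeof(uint32_t);

	/* Assumed preconditions: aligned pointers, whole number of ints. */
	assert(((uintptr_t)src % sizeof(uint32_t)) == 0);
	assert(((uintptr_t)dst % sizeof(uint32_t)) == 0);
	assert(len % sizeof(uint32_t) == 0);

	/* Forward copy only: no overlap handling and no byte tail. */
	while (n-- > 0)
		*d++ = *s++;
}
```

Leaving out the overlap check and the byte tail is what keeps such a routine
small; callers that cannot guarantee the preconditions would use bcopy().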

Revision 1.6
Fri Jul 16 05:48:29 2004 UTC (10 years ago) by dillon
Branches: MAIN
CVS tags: DragonFly_Stable, DragonFly_Snap29Sep2004, DragonFly_Snap13Sep2004, DragonFly_RELEASE_1_2_Slip, DragonFly_RELEASE_1_2
Changes since revision 1.5: +25 -17 lines
Update all my personal copyrights to the DragonFly Standard Copyright.

Revision 1.5
Wed May 5 19:26:38 2004 UTC (10 years, 2 months ago) by dillon
Branches: MAIN
CVS tags: DragonFly_1_0_REL, DragonFly_1_0_RC1, DragonFly_1_0A_REL
Changes since revision 1.4: +102 -72 lines
Another major mmx/xmm/FP commit.  This is a combination of several patches,
but since the earlier patches didn't actually fix the crashing and corruption
issues we were seeing, everything has been rolled into one well-tested commit.

Make the FP more deterministic by requiring that npxthread and the FP state
be properly synchronized, and that the FP be in a 'safe' state (meaning
that mmx/xmm registers be useable) when npxthread is NULL.  Allow the FP
save area to be revectored.  Kernel entities which use the FP unit,
such as the bcopy code, must save the app state if it hasn't already been
saved, then revector the save area.

Note that combinations of operations must be protected by a critical section
or interrupt disablement.  Any clearing or setting of npxthread combined with
an fxsave/fnsave/frstor/fxrstor/fninit must be protected as an atomic entity.
Since interrupts are threads and can preempt, such preemption will cause
a thread switch to occur and thus cause npxthread and the FP state to be
manipulated.  The kernel can only depend on the FP state being stable for its
use after it has revectored the FP save area.

This commit fixes a number of issues, including potential filesystem
corruption and kernel crashes.

Revision 1.4
Sat May 1 03:38:36 2004 UTC (10 years, 2 months ago) by dillon
Branches: MAIN
Changes since revision 1.3: +39 -1 lines
Add bcopyb() back in for the PCVT driver.  bcopyb() is explicitly
byte-granular for the few (one?) memory-mapped devices which cannot always
handle 16 or 32 bit ops.
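
For illustration only, a byte-granular copy can be sketched in C as below.
The name bcopyb_sketch is hypothetical, and the overlap handling is an
assumption carried over from how bcopy() behaves; the commit message does not
say whether bcopyb() handles overlap.  The volatile qualifiers keep the
compiler from merging the byte accesses into wider ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical byte-at-a-time copy: every access is exactly 8 bits wide.
 * Argument order follows bcopy(): source first, then destination.
 */
static void
bcopyb_sketch(const volatile void *src, volatile void *dst, size_t len)
{
	const volatile uint8_t *s = src;
	volatile uint8_t *d = dst;

	assert(len == 0 || (src != NULL && dst != NULL));

	if ((uintptr_t)d - (uintptr_t)s >= len) {
		/* dst does not start inside [src, src+len): copy forwards. */
		while (len-- > 0)
			*d++ = *s++;
	} else {
		/* dst overlaps the tail of src: copy backwards. */
		s += len;
		d += len;
		while (len-- > 0)
			*--d = *--s;
	}
}
```

The single unsigned comparison covers both "dst below src" (the subtraction
wraps to a large value) and "dst at or past the end of src".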

Reported-by: David Rhodus

Revision 1.3
Fri Apr 30 02:59:14 2004 UTC (10 years, 2 months ago) by dillon
Branches: MAIN
Changes since revision 1.2: +23 -2 lines
Fix a race in the FP copy code.  If we set up our temporary FP save area
before we check npxthread it is possible for a one-instruction-window
interrupt to come along and save the application FP state to our temporary
area and then clear npxthread, causing the application FP state to be thrown
away.

Also, if there is no app FP state (npxthread is NULL), it is possible
once we set npxthread=curthread for an interrupt to come along and save
bogus FP state to our temporary save area before we have a chance to
fninit (one instruction window since we clts just prior to the fninit),
causing the fninit to fault and npxdna to restore the bogus state.

Use a critical section to prevent these cases from occurring.
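
The ordering problem can be modelled in plain C.  The sketch below is a
single-threaded simulation, not kernel code: every name in it (thread_sketch,
fp_switch_save, window_hook, and so on) is hypothetical, an integer stands in
for the FP registers, and an "interrupt" is just a function that is deferred
while the critical-section count is non-zero.

```c
#include <assert.h>
#include <stddef.h>

struct fpsave { int state; };

struct thread_sketch {
	struct fpsave  td_fpstore;	/* the thread's own save area */
	struct fpsave *td_savefpu;	/* where FP state is saved right now */
};

/* Simulated per-CPU state. */
static struct thread_sketch *npxthread;	/* owner of live FP state, or NULL */
static int fp_hw_state;			/* stands in for the FP registers */
static int crit_nest;			/* > 0: preemption is deferred */
static int irq_pending;
static void (*window_hook)(void);	/* test hook: fires mid-sequence */

/* What a preempting thread switch does: save the state and disown it. */
static void
fp_switch_save(void)
{
	if (npxthread != NULL) {
		npxthread->td_savefpu->state = fp_hw_state;
		npxthread = NULL;
	}
}

static void
interrupt_arrives(void)
{
	if (crit_nest > 0)
		irq_pending = 1;	/* deferred until crit_exit() */
	else
		fp_switch_save();
}

static void crit_enter(void) { crit_nest++; }

static void
crit_exit(void)
{
	if (--crit_nest == 0 && irq_pending) {
		irq_pending = 0;
		fp_switch_save();
	}
}

static void
kernel_fp_begin(struct thread_sketch *cur, struct fpsave *tmp)
{
	crit_enter();
	if (npxthread != NULL)
		fp_switch_save();	/* app state goes to its own area */
	if (window_hook != NULL)
		window_hook();		/* an interrupt here is only deferred */
	cur->td_savefpu = tmp;		/* revector after the check, not before */
	npxthread = cur;
	fp_hw_state = 0;		/* stands in for fninit */
	crit_exit();
}
```

In this model an interrupt landing inside the sequence runs only at
crit_exit(), by which time the application state is already in the thread's
own save area and the save pointer targets the temporary area.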

Revision 1.2
Fri Apr 30 00:59:52 2004 UTC (10 years, 2 months ago) by dillon
Branches: MAIN
Changes since revision 1.1: +35 -50 lines
Correct a bug in the last FPU optimized bcopy commit.  The user FPU state
was being corrupted by interrupts.

Fix the bug by implementing a feature described as a missif in the original
FreeBSD comments... add a pointer to the FP saved state in the thread
structure so routines which 'borrow' the FP unit can simply revector the
pointer temporarily to avoid corruption of the original user FP state.
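
A minimal sketch of the revectoring idea, assuming a thread structure with an
embedded save area and a pointer to the current save target.  The structure
and function names here are hypothetical and a memset() stands in for
fnsave/fxsave; this is not the layout used in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct fpsave { unsigned char area[512]; };	/* size is illustrative */

struct thread_sketch {
	struct fpsave  td_fpstore;	/* holds the user FP state */
	struct fpsave *td_savefpu;	/* where the next save will land */
};

/* Point the save pointer at 'area' and hand back the previous target. */
static struct fpsave *
fp_revector(struct thread_sketch *td, struct fpsave *area)
{
	struct fpsave *old = td->td_savefpu;

	assert(area != NULL);
	td->td_savefpu = area;
	return old;
}

/* Stand-in for an FP save triggered by an interrupt or thread switch. */
static void
fp_save_sim(struct thread_sketch *td, unsigned char pattern)
{
	memset(td->td_savefpu->area, pattern, sizeof(td->td_savefpu->area));
}
```

A borrower would call fp_revector() with a temporary area before touching the
FP unit and call it again with the returned pointer when done, so any save
that happens in between lands in the temporary area rather than on top of the
user state.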

The MMX_*_BLOCK macros in bcopy.s have also been simplified somewhat.  We
can simplify them even more (in the future) by reserving FPU save space in
the per-cpu structure instead of on the stack.

Revision 1.1
Thu Apr 29 17:24:58 2004 UTC (10 years, 2 months ago) by dillon
Branches: MAIN
Rewrite the optimized memcpy/bcopy/bzero support subsystem.  Rip out the
old FreeBSD code almost entirely.

* Add support for stacked ONFAULT routines, allowing copyin and copyout to
  call the general memcpy entry point instead of rolling their own.

* Split memcpy/bcopy and bzero into their own files

* Add support for XMM (128 bit) and MMX (64 bit) media instruction copies

* Rewrite the integer code.  Also note that most of the previous integer
  and FP special case support had been ripped out of DragonFly long ago
  in that the assembly was no longer being referenced.  It doesn't make
  sense to have a dozen different zeroing/copying routines so focus on
  the ones that work well with recent (last ~5 years) cpus.

* Rewrite the FP state handling code.  Instead of restoring the FP state
  let it hang, which allows userland to make multiple syscalls and/or for
  the system to make multiple bcopy()/memcpy() calls without having to
  save/restore the FP state on each call.  Userland will take a fault when
  it needs the FP again.

  Note that FP optimized copies only occur for block sizes >= 2048 bytes,
  so this is not something that userland, or the kernel, will trip up on
  every time it tries to do a bcopy().

* LWKT threads need to be able to save the FP state; add the simple
  conditional and 5 lines of assembly required to do that.
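
The 2048-byte cutoff mentioned in the FP state item above amounts to a size
check at the copy entry point.  A hedged C sketch of that decision follows;
the enum, the function name, and the cpu_has_mmx flag are hypothetical, and
only the 2048 figure comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

enum copy_path { COPY_INTEGER, COPY_FP };

#define FP_COPY_MIN	2048	/* threshold quoted in the commit message */

/* Pick a copy strategy; small blocks never touch the FP unit. */
static enum copy_path
choose_copy_path(size_t len, int cpu_has_mmx)
{
	if (cpu_has_mmx && len >= FP_COPY_MIN)
		return COPY_FP;
	return COPY_INTEGER;
}
```

Keeping small copies on the integer path means the cost of saving the FP
state is only paid when the block is large enough to amortize it.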

AMD Athlon notes: 64 bit media instructions will get us 90% of the way
there.  It is possible to squeeze out slightly more memory bandwidth from
the 128 bit XMM instructions (SSE2).  While it does not exist in this commit
there are two additional features that can be used:  prefetching and
non-temporal writes.  Prefetching is a 3DNow! instruction and can squeeze
out significant additional performance if you fetch ~128 bytes ahead of
the game, but I believe it is AMD-only.  Non-temporal writes can double
UNCACHED memory bandwidth, but they have a horrible effect on L1/L2
performance and you can't mix non-temporal writes with normal writes without
completely destroying memory performance (e.g. multiple GB/s -> less than
100 MBytes/sec).

Neither prefetching nor non-temporal writes are implemented in this commit.
