1 31st March 01:32
manoj paul joseph
Default MASM and TASM



Hi Beth,
[Warning: Newbie]
Do you mean to say that Linux uses interrupts and not call gates to make
system calls?
If so, can you explain why? Is that faster?

Regards,
Manoj

2 31st March 01:32
tim roberts

Yes, and so do both Windows 9X and Windows NT.


Doesn't really matter. Any way you do a ring transition, it takes a
boatload of cycles.
--
- Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.
3 31st March 01:32
michael brown

IIRC, using sysenter/sysexit (or whatever AMD and Intel are calling them) is
actually *very* fast, on the order of 150 cycles or so on an Athlon XP. This
is compared to somewhere between 1500-2000 for call gates/interrupts. I also
thought Linux was using sysenter/sysexit nowadays, but I'm quite possibly
mistaken here.

--
Michael Brown
http://www.emboss.co.nz : OOS/RSI software and more
Add michael@ to emboss.co.nz - My inbox is always open
4 31st March 01:33
beth

Go check out http://linuxassembly.org to get more detailed
information...but, basically, Linux actually makes its system calls
available via INT 80h, using registers for passing parameters...EAX
takes the "function number" (similar to how AH is used under DOS INT
21h) and then (left to right) ebx, ecx, edx, esi, edi and ebp take
the parameters, in a sort of "fastcall" style (ebp is a more recent
addition which wasn't in the older kernels)...
If there aren't enough registers for all the parameters ( > 5 or 6
arguments), then the format is slightly different to cope with this
situation...it then changes over to the parameters being stored in
memory and only ebx (in addition to eax, of course, for the function
number) is used to provide an address pointer to find the parameters
in memory...

[ This scheme is actually not greatly different to what I was
proposing for my own OS system call design...except that I'd throw out
the EAX "function number" business (on the other end, this mandates
some sort of "switch" statement to sort out between all the different
functions possible and, well, that's an unnecessary waste of time if,
instead, each function is separated out and the application itself
just calls the right function) and change the interrupt to direct
CALL instructions...unlike Linux's above scheme, this would require
some linking like Windows has in order to get the kernel API addresses
in a "portable" way that allows them to be moved around for different
versions...it makes a bit more sense to be dealing with this awkward
nonsense _once at compile time_ with some linking stuff than to do it
Linux's INT 80h way...which actually means that the kernel can be used
_without_ linking to anything, if you use the INT 80h interface...but,
well, for that minor convenience, it uses an interrupt service rather
than a CALL...and that's a more expensive way because it's
"relocating" on-the-fly, even if slightly more convenient to use than
the "import library" nonsense of Windows (although import libraries
and that whole linking process could be made infinitely more friendly)
... ]

When you call some system call using a C function interface (using
stuff from libc or whatever) then, in fact, that's not the real
system call but just a "portable" wrapper over an INT 80h...

Why is it done like this? Good question...ask Linus ...

No, seriously, there are a number of reasons for a scheme like
this...using registers suggests "faster" is amongst them...although,
with Linux, that INT instruction's going to be a much bigger
performance hit than a direct CALL, Windows-style...but, then again,
the INT has "portability" to prefer it over CALL...a CALL has to be
relocated by the loader or linked up with a specific-version library
of kernel addresses...well, an INT is more useful here from the
"portability" angle because you can just designate an INT number (Linux
= 80h and DOS = 21h, Video BIOS = 10h for old DOS programming, which
have the same basic "relocatable" reason for using INTs too)...and
then it's just a case of the kernel loading the right addresses into
the interrupt table instead...not requiring any linking support on the
application side...because, in a sense, the "relocation" is done
on-the-fly using the mechanics of an interrupt call itself...

The actual usual reason for this interface approach is to properly
decouple a kernel interface from some arbitrary language...a different
set of "wrappers" designed for Pascal could also be created, for
instance...assembly code can by-pass these wrappers and work direct
with the kernel (slightly faster interface without any mandatory HLL
use of the stack or anything...although, it does have to be stressed
that making a software interrupt, the user -> supervisor transition and
the "switch" to locate the "function number" and the API itself will
be consuming far, far more time than a simple PUSH / POP of parameters,
so it is only _fractionally_ faster and smaller - a few memory bytes
and a few instructions shorter - and you won't find your application
suddenly becoming ten times faster or anything...but it _is_ shorter
and faster so if every cycle and byte counts, it _is_ a case of being
better...just, well, don't expect miracles of performance because of
it)...

[ In fact, the post I made before about OS architecture was basically
talking about taking the best of Linux _and_ Windows
designs...following Linux when it comes to using registers and then
using "wrappers" to cater for HLLs...but following Windows when it
comes to not using INTs but making direct CALLs to the API (and
avoiding that unnecessary "switch" statement working out which
function you want because they are separated out into individual APIs,
anyway, so the CALL itself selects which API you want...which makes
better sense as the application already knows which API it needs, so
it makes more sense to offer multiple "entry points", one for each
API, and let the application CALL directly to them rather than having
just one entry point into the kernel and then wasting run-time sorting
out which "function number" is required)...that is, basically, both
Windows and Linux have good ideas but, at the same time, have dumb
ideas too...so, I was just plucking out all the good ideas and banging
them together, trying to avoid any of the bad ideas...oh, and I added
the "caller preserves registers" angle because the philosophical angle
here was to eliminate _all redundancy_ for the low-level API
usage...and, well, HLLs can simply place "wrappers" around such CALLs
to get all their HLL stuff (without making those of non-HLLs or
different HLLs - as Pascal convention isn't the same as C - suffer HLL
bullcrap that their chosen language doesn't actually need at all)
... ]

It isn't totally about "being faster" or Linus could have made
improvements to go a little faster again...it's more about remembering
to look at it from the OS developer's point of view...the reverse
angle...you effectively start with nothing and then _add_ API and
_add_ HLL stuff and so on...so should we really call this "being
faster" or is it more a case of "why add something redundant that just
makes it slower"?

This is the ASM point of view too, which is why ASM programs tend to
be smaller and faster than their HLL counterparts when you adopt this
attitude...I share this point of view...it's not "faster",
really...it's "there's no good reason to add something that serves no
useful purpose other than to slow it all down and bloat it"...wrappers
can deal with "C compatibility" perfectly well...it _serves no useful
purpose_ to hard-wire such compatibility into a kernel
itself...another example of "portability for portability's sake"
because, at this low-level and where "wrappers" are perfectly capable
of providing any HLL functionality required, it _serves no useful
purpose_ to hard-wire this stuff into an OS...it basically only ends
up being otherwise because these OSes are coded using C and the
authors can't be bothered (though, note, Linus uses C but he doesn't
let that get in the way of good design)...

I see little excuse, really, anyway...as a C compiler could easily be
modified to include this special "fastcall" implementation as an extra
one of the calling conventions it supports...hey presto, exactly as
easy to code thereafter without making all software suffer for your
convenience...there's, of course, nothing sacred in the C convention
or Pascal convention or some built-in "fastcall" convention (which,
for example, many compilers support in their own proprietary way)...I'm
NOT talking about anything without precedent...so, why not simply add
a "fastcall" convention to the compiler which matches that used by the
kernel on the OS it is targeted for? Now there's something _really
useful_ that compilers could add for "portability"...a "_OS" calling
convention which portably refers to the native OS's calling
convention...then you can specify that in your C code without actually
needing to know what that convention actually is...this would actually
be _more_ "portable" than having to just "know" that Windows uses
"stdcall" and UNIXen prefer "C" conventions and such like...if HLL
people like abstraction then abstract these facts away...there is a
"_OS" calling convention provided which is appropriate to the native
OS that version targets...what it actually is, is irrelevant because
we've "abstracted" it away...I mean, this is what so annoys me about
some of that HLL arrogance about "abstraction"...it's not just the
ludicrous claims that abstraction could bring about World Peace...it's
also the fact that they don't abstract when they should in many cases,
anyway, and then abstract something they actually shouldn't be
abstracting in other cases...there's one thing worse than an obsessed
evangelist: an obsessive evangelist that actually has no idea what it
is they are really advocating...

Beth
5 31st March 01:33
hp

linux syscalls "sorting out" by:
call *SYMBOL_NAME(sys_call_table)(,%eax,4)
re lx source file 'entry.S'; should be fairly fast code. hp --

Linux,Assembly,Forth: http://www.lxhp.in-berlin.de/index-lx.shtml
6 9th April 05:56
beth

Yes, but - dare I use the cliché? - "the fastest code is the code
that never runs"...if separated out then the application could just
call the API directly and dive straight into the appropriate code
without any such device at all...however well-optimised, "some code"
isn't going to beat "no code"...

Anyway, is it quite this short, sweet and simple? What would happen,
for example, if I put some random large value into EAX and made the
INT 80h call? Clearly, this table of pointers to the API couldn't
literally be 16GB (32-bits; 4GB * 4 bytes per DWORD pointer) in
size...so are there no checks at all that the value in EAX is within
the bounds of the "sys_call_table"?

Beth

P.S. A weird thing just happened there...in the newsgroup list, the
address of your (hp's) post was listed as AOA and CLAX...but when I
did "reply", the address changed into ALA (which wasn't listed before)
and the other two disappeared somewhere...not sure what happened there
so I've reset the cross-posting "address" to include the full four
groups the OP originally posted to again...that's a mighty weird bug

7 9th April 05:56
beth

No, you're not mistaken about that...I looked the general information
up just to verify my facts for the post I wrote in reply...Linux does
now have sysenter / sysexit support...but, according to the site I was
browsing, it's considered "experimental" and "unstable" at the
moment..."use at your own risk!" stuff...I actually saw that and meant
to add a small note onto my post to mention this new
possibility...but, ummm, I just plain forgot to put it in, in the end

[ That's another thing about general OS architecture...do we need to
have user -> supervisor transitions and suchlike for every
single API? Isn't this yet another redundancy again? What I mean is,
not all of the kernel's API functions need be supervisor level (for
instance, why does Windows' "lstrlen" need privileged access to I/O
port permissions and altering the GDT and suchlike, merely
to count how many bytes are in a user-supplied string?)...thus,
wouldn't it be more sensible to better limit and define things...you
know, only bother with making privilege transitions and that sort of
thing when we really need to...

And, speaking of which, when do we really need to do that, anyway? We
don't really...the OS can have just a very small "core" section which
truly needs the privileged access (it's the bit that sets up GDT
tables and grants I/O permissions, etc.)...and then _everything else_
is user code...device drivers included, as the "core" can just open up
"gaps" in the protections (the I/O permissions map) to allow it to
do what it needs to do without arbitrarily handing over all ring0
powers and privileges...it shouldn't just be better speed but it's
also surely a vastly more secure way to go about things too...that,
simply, _ALL_ code but a few "core" internal functions - the ones that
set up memory tables and alter them and grant I/O permissions to other
applications, which need the full supervisor permissions and stuff - is
user code...otherwise, the entire system is totally user mode ring3 code
from top to bottom...the "privileged" stuff is modularised into a
small "core" internal thingy which really only needs to be called to
request particular changes to permissions...a device driver shouldn't
be needing "LGDT" or whatever, anyway, right? Seems daft to go the VxD
way and just give it free rein to do anything it likes...just open up
a "gap" in the I/O port map and / or map the relevant memory (frame
buffer or whatever) to be accessible and that's all it needs...well,
that can be done from user mode code... ]

Beth
8 9th April 05:56
hp

I wouldn't ever recommend doing so! the above was only cited to convince you
that the entry is not as inefficient as one might assume.

basically, it is. but not that simple, though, because the entry procedure
checks for appropriate bounds (which requires a single op), saves regs and
cares for a few other things, i.e. signals, external ints ('atomicity')
etc. plus, the syscall routines would even prevent access to unsuitable
addresses without serious consequences.

which certainly is not as fast as it could be, but prevents the ultimately
'fast' code: reset by a single false instr. imho, merely a matter of design
'philosophy' - which w. linux, for instance, does not give you those stupid
"privilege violation"- or "restart the system"-like messages ever so often.

probably because I'd send a corrected version, after I'd delete the (my)
original message -?-

best,
hp --

x86 linux,assembler,forth: http://www.lxhp.in-berlin.de/index-lx.shtml