Tuesday, 28 August 2012

Re: [gccsdk] Unaligned data access vs pack(push...

[Hope you don't mind me keeping this on the mailing list]

On Tue, Aug 28, 2012 at 01:04:51PM +1200, Ron wrote:
> I'm not sure at what point the compiled program would switch on the
> feature and if it switches it off, or if it expects the machine to be
> in that mode already. If you are polling/multitasking it would have to
> switch back and forward, which might not be possible either.

If it's a compiler feature, you either get it for the whole program or you
don't get it at all. The compiled program doesn't mess about with the CPU
settings, it expects the mode to be already set (and falls over otherwise).
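
For code that has to run whichever way the CPU is configured, one common workaround is to avoid dereferencing a possibly misaligned pointer and copy the bytes instead. What follows is only a minimal sketch, not something taken from LZ4 or from this thread: load_u32_unaligned and load_u32_le are hypothetical helper names, and the exact instructions the compiler emits for them are assumed to depend on the target and its options.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from LZ4 or the thread): read a 32-bit value
 * from an address that may not be word-aligned. Going through memcpy
 * avoids dereferencing a misaligned uint32_t pointer, so the code does
 * not rely on the CPU's unaligned-access setting. Host byte order. */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: the same read assembled byte by byte, with a
 * fixed little-endian layout, so the result is independent of both
 * alignment and host endianness. */
static uint32_t load_u32_le(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Either helper can be pointed at an odd offset into a byte buffer, e.g. load_u32_le(buf + 1), without caring how the machine handles unaligned loads.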

Setting it on a per-process basis has been suggested, but nobody has written
anything to do it. Indeed, the OS is completely unaware of the issue: it
doesn't even provide any means to change the state.

> Possibly, this seems to be the area for software vs hardware bitcounting
>
> #if defined(_MSC_VER) && !defined(LZ4_FORCE_SW_BITCOUNT)
> unsigned long r = 0;
> _BitScanForward64( &r, val );
> return (int)(r>>3);
> #elif defined(__GNUC__) && ((__GNUC__ * 100 + __GNUC_MINOR__) >= 304) && !defined(LZ4_FORCE_SW_BITCOUNT)
> return (__builtin_ctzll(val) >> 3);
> #else
> static const int DeBruijnBytePos[64] = { 0, 0, 0, 0, 0, 1, 1, 2, 0, 3, 1, 3, 1, 4, 2, 7, 0, 2, 3, 6, 1, 5, 3, 5, 1, 3, 4, 4, 2, 5, 6, 7, 7, 0, 1, 2, 3, 3, 4, 6, 2, 6, 5, 5, 3, 4, 5, 6, 7, 1, 2, 4, 6, 4, 4, 5, 7, 2, 6, 5, 7, 6, 7, 7 };
> return DeBruijnBytePos[((U64)((val & -val) * 0x0218A392CDABBD3F)) >> 58];
>
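
To make the software branch of the quoted snippet easier to follow, here is a rough standalone sketch of it. The table and multiplier are copied from the quote; the function names first_set_byte_sw and first_set_byte_ref are made up for illustration, and the plain-loop reference is an assumption about what the lookup is meant to compute (the index of the lowest non-zero byte of a 64-bit value), not part of LZ4.

```c
#include <assert.h>
#include <stdint.h>

/* Table copied from the quoted snippet. */
static const int DeBruijnBytePos[64] = {
    0, 0, 0, 0, 0, 1, 1, 2, 0, 3, 1, 3, 1, 4, 2, 7,
    0, 2, 3, 6, 1, 5, 3, 5, 1, 3, 4, 4, 2, 5, 6, 7,
    7, 0, 1, 2, 3, 3, 4, 6, 2, 6, 5, 5, 3, 4, 5, 6,
    7, 1, 2, 4, 6, 4, 4, 5, 7, 2, 6, 5, 7, 6, 7, 7
};

/* Software path (hypothetical name): index of the lowest non-zero byte
 * of val. val must be non-zero. */
static int first_set_byte_sw(uint64_t val)
{
    uint64_t lowest;
    assert(val != 0);
    /* Isolate the lowest set bit; unsigned negation is well defined. */
    lowest = val & (0 - val);
    /* The multiply shifts the De Bruijn constant left by the bit index;
     * the top 6 bits of the product then select the table entry. */
    return DeBruijnBytePos[(lowest * 0x0218A392CDABBD3FULL) >> 58];
}

/* Reference (hypothetical name): the same answer from a plain loop
 * that skips zero bytes from the low end. */
static int first_set_byte_ref(uint64_t val)
{
    int n = 0;
    assert(val != 0);
    while ((val & 0xFF) == 0) {
        val >>= 8;
        n++;
    }
    return n;
}
```

The hardware branches in the quote (_BitScanForward64 and __builtin_ctzll, each followed by >> 3) are aiming at the same byte index; which branch is taken is decided at compile time by the preprocessor tests and LZ4_FORCE_SW_BITCOUNT.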
