Ramanarayanan, Ramshankar | 1 Jul 07:14 2011

Re: code review request for CG SIB default at O3

It has been a bit since I sent this out. Could someone take up this code review?

 

Ram

 

From: Ramanarayanan, Ramshankar
Sent: Monday, June 06, 2011 5:36 PM
To: 'open64-devel <at> lists.sourceforge.net'
Subject: code review request for CG SIB default at O3

 

I would like a code review for the attached diff.

 

This diff gives more control over the X8664 “Scale, Index, Base” (SIB) flag CG_merge_counters_x86 by adding the BOOL variable CG_merge_counters_x86_set, which records whether the flag was set explicitly. Using this hook, we set CG_merge_counters_x86 at O3 unless it was set explicitly, thereby enabling SIB by default at O3.

 

This change has been tested with SPEC 2006 and other FP applications.

 

Best regards,

Ram

 

Member of Technical Staff,

Open Source Compiler Engineering,

Advanced Micro Devices

------------------------------------------------------------------------------
All of the data generated in your IT infrastructure is seriously valuable.
Why? It contains a definitive record of application performance, security 
threats, fraudulent activity, and more. Splunk takes this data and makes 
sense of it. IT sense. And common sense.
http://p.sf.net/sfu/splunk-d2d-c2
_______________________________________________
Open64-devel mailing list
Open64-devel <at> lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/open64-devel
Jian-Xin Lai | 1 Jul 07:32 2011

Re: Code review request for vectorizer patch [LNO] [CG]

The changes in whirl2.p (the enhancement on OPR_SELECT and changes in
CG EXP phase) look fine to me.

2011/6/29 Mathew, Pallavi <Pallavi.Mathew <at> amd.com>:
> Hi Fred, Mei and CG gatekeeper,
> Can you please take a look at the attached revised and re-tested patches that introduce a framework for
> vectorization of IF-statements?
>
> whirl3.p: revised patch that extends the supported types of OPR_SELECT to include V16I1 (to support
> generation of blend instruction). The patch also extends rtype/desc of OPR_EQ to include V16I8.
> vectorizer3.p: the only update to this patch is that the use of OPR_BLEND is replaced with OPR_SELECT.
>
> Thanks.
> Pallavi
>
> -----Original Message-----
> From: Fred Chow [mailto:frdchow <at> gmail.com]
> Sent: Wednesday, June 15, 2011 10:56 PM
> To: Mathew, Pallavi
> Cc: open64-devel <at> lists.sourceforge.net
> Subject: Re: [Open64-devel] Code review request for vectorizer patch [LNO] [CG]
>
> I see no reason why SELECT cannot be used to represent BLEND by
> extending its supported types.  New operators should be avoided whenever
> possible, to avoid proliferation of variants of WHIRL.  Following is my
> enhancement of the definition of SELECT, assuming V16 is what you need now:
>
> SELECT            res=i,f,V16    desc=B,i,V16                [H-VL]
> Both Kid 1 and Kid 2 must have res as their result types.  If Kid 0 is
> not type V16, return Kid 1 if Kid 0 evaluates to true; otherwise, return
> Kid 2.  If Kid 0 is type V16, both Kids 1 and 2 must also be type V16,
> and the resulting V16 vector has its elements selected from the
> corresponding elements of either Kid 1 or 2 based on their corresponding
> boolean elements in Kid 0.  The evaluation of both Kids 1 and 2 can be
> performed regardless of the value of Kid 0.  Converting an if statement
> to this operation is tantamount to speculation if Kid 1 or 2 are
> expressions.
>
> Fred
>
> On 06/15/2011 04:45 PM, Mathew, Pallavi wrote:
>> Hi Fred,
>> Couple of points in favor of the new operator:
>> 1. Semantically the BLEND operation is like a concatenation operation combining components from
>> separate sources while the SELECT operation selects one or the other source. I realize though that this
>> can be argued against by considering blend as a packed or vectorized select.
>> 2. While the SELECT operation places restriction on the result type of its kids (kid1/2 must match the
>> result type of select and kid0 must evaluate to a boolean expression), the BLEND has no such constraints
>> and interprets its kids as 128-bit vectors.
>>
>> The patch proposes V16I1V16I1BLEND for pblendvb, but to accommodate all other flavors of the blend*
>> operation, the range of rtype/desc will have to be extended. In other words, the new operation is intended
>> to represent not one, but a set of blend operations.
>>
>> Thanks.
>> Pallavi
>>
>> -----Original Message-----
>> From: Fred Chow [mailto:frdchow <at> gmail.com]
>> Sent: Wednesday, June 15, 2011 2:56 PM
>> To: Mathew, Pallavi
>> Cc: open64-devel <at> lists.sourceforge.net
>> Subject: Re: [Open64-devel] Code review request for vectorizer patch [LNO] [CG]
>>
>> Instead of defining the new operator BLEND, why don't you just extend
>> the existing SELECT operator by allowing it to use V16 types?  This is
>> what we have been doing when we started to support SIMD types for
>> various operations for the x86 target.
>>
>> Fred
>>
>> On 06/15/2011 02:18 PM, Mathew, Pallavi wrote:
>>> Here is the definition:
>>>
>>> OPR_BLEND:
>>> Ternary operator with rtype=V16I1 and desc=V16I1.
>>> Each kid is interpreted as a 128-bit vector of 16 bytes.
>>> The result is also a 128-bit vector of bytes computed by
>>> selecting bytes from either kid0 or kid1 depending upon
>>> value of the mask in kid2 as shown below:
>>> if (msb(kid2[n]) == 0)
>>> then
>>>      result[n] = kid0[n]
>>> else
>>>      result[n] = kid1[n]
>>>
>>> where msb(x) represents the most significant bit of x
>>> and foo[n] represents the n-th byte of foo.
>>>
>>> -Pallavi
>>>
>>>
>>> -----Original Message-----
>>> From: Fred Chow [mailto:frdchow <at> gmail.com]
>>> Sent: Wednesday, June 15, 2011 10:41 AM
>>> To: Sun Chan
>>> Cc: Mathew, Pallavi; open64-devel <at> lists.sourceforge.net
>>> Subject: Re: [Open64-devel] Code review request for vectorizer patch [LNO] [CG]
>>>
>>> Instead of looking at source code and guessing at the definition of
>>> OPR_BLEND, is there some document where we can describe the definition
>>> of new WHIRL opcodes?
>>>
>>> Fred
>>>
>>> On 06/14/2011 02:51 PM, Sun Chan wrote:
>>>> This is a WHIRL change, Fred, can you look at the change (at least the
>>>> new OPR_BLEND) to see if that is ok?
>>>> Sun
>>>>
>>>> On Tue, Jun 14, 2011 at 1:39 PM, Mathew, Pallavi<Pallavi.Mathew <at> amd.com>    wrote:
>>>>> Hi,
>>>>>
>>>>> Can a gatekeeper please review the attached patch which introduces a
>>>>> framework for vectorization of IF-statements of the form:
>>>>>
>>>>>      - if (x != 0) { single_istore_statement } else {empty_body}
>>>>>
>>>>>      - if (x != 0) { if (y !=0) {single_istore_statement} else {empty_body}}
>>>>> else {empty_body}
>>>>>
>>>>>
>>>>>
>>>>> Sample program:
>>>>>
>>>>> long array[2000000];
>>>>>
>>>>> void sample() {
>>>>>      long i;
>>>>>      for (i = 0; i < 2000000; i++) {
>>>>>        if (array[i])
>>>>>          array[i] ^= i;
>>>>>      }
>>>>>      return;
>>>>> }
>>>>>
>>>>>
>>>>>
>>>>> Vectorization of such if-statement is done by first vectorizing its
>>>>> subexpressions. The result of the vectorized if-condition is computed by
>>>>> 'pcmpeqq' (V16I8EQ) and is used to select between the result of executing
>>>>> the statement in the if-body or leaving the array element unchanged. This
>>>>> selection is performed by the 'pblendvb' (V16I1V16I1BLEND) operation. Both
>>>>> of these are SSE4.1 instructions.
>>>>>
>>>>> We introduce a new whirl operation OPR_BLEND which eventually gets
>>>>> translated to 'pblendvb'.
>>>>>
>>>>> This optimization is turned on by default and can be controlled by
>>>>> -LNO:simd_vect_if={on/off}.
>>>>>
>>>>> This patch also recognizes and handles vectorization of invariants rooted at
>>>>> OPR_ADD, OPR_SUB and OPR_MPY.
>>>>>
>>>>>
>>>>>
>>>>> Files updated by this patch:
>>>>>
>>>>> osprey/be/lno/simd.cxx
>>>>>
>>>>>     - vectorization of if-statements.
>>>>>
>>>>>     - vectorization of invariants rooted at OPR_ADD, OPR_SUB and OPR_MPY.
>>>>>
>>>>> osprey/common/com/config_lno.h
>>>>>
>>>>> osprey/common/com/config_lno.cxx
>>>>>
>>>>>     - add flag to control vectorization of if-statements
>>>>>
>>>>> osprey/common/com/opcode_gen_core.h
>>>>>
>>>>>     - add the OPR_BLEND operation and its opcode.
>>>>>
>>>>>     - add opcode for vectorized version of OPR_EQ.
>>>>>
>>>>> osprey/common/com/opcode_gen_core.cxx
>>>>>
>>>>>     - specify number of kids, desc, rtype, property(expression), etc. for
>>>>> OPR_BLEND.
>>>>>
>>>>> osprey/common/com/wn.cxx
>>>>>
>>>>>     - enable WN_has_side_effects for opr_blend
>>>>>
>>>>> osprey/be/opt/opt_bdce.cxx
>>>>>
>>>>>     - propagate live bits down the expression tree.
>>>>>
>>>>> osprey/be/cg/whirl2ops.cxx
>>>>>
>>>>>     - expand expression containing OPR_BLEND
>>>>>
>>>>>     - add handler for blend.
>>>>>
>>>>> osprey/be/cg/cgexp_internals.h
>>>>>
>>>>>     - add declaration of Expand_Blend
>>>>>
>>>>> osprey/be/cg/x8664/expand.cxx
>>>>>
>>>>>     - expand OPC_BLEND to expression containing TNs
>>>>>
>>>>> osprey/be/com/x8664/betarget.cxx
>>>>>
>>>>>     - convert opcode to TOP... (return top_blend...)
>>>>>
>>>>> osprey/common/com/wn_util.h
>>>>>
>>>>> osprey/common/com/wn_util.cxx
>>>>>
>>>>>     - add utility routine.
>>>>>
>>>>>
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Pallavi
>>>>>

-- 
Regards,
Lai Jian-Xin
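
The per-byte selection discussed in this thread (OPR_BLEND as first proposed, later expressed as a V16 OPR_SELECT) can be modelled in scalar C++ roughly as follows. This is a minimal sketch and not the code from the patch: the names V16I1, Lanes, blend_bytes, cmpeq_zero and sample_if_converted are hypothetical, it assumes a 64-bit `long` (modelled as std::int64_t), and the operand order of the compare and blend in the real patch may differ. It blends the old and the xor-ed values of two 64-bit lanes using a zero-compare mask, in the spirit of the 'pcmpeqq' plus 'pblendvb' sequence described above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// One 128-bit vector viewed as 16 bytes (the V16I1 view) or as two 64-bit lanes.
using V16I1 = std::array<std::uint8_t, 16>;
using Lanes = std::array<std::int64_t, 2>;
static_assert(sizeof(V16I1) == sizeof(Lanes), "both views must cover 128 bits");

// Byte-wise blend following the OPR_BLEND definition quoted in the thread:
// result[n] = kid0[n] if msb(kid2[n]) == 0, else kid1[n].
V16I1 blend_bytes(const V16I1& kid0, const V16I1& kid1, const V16I1& kid2) {
    V16I1 result{};
    for (std::size_t n = 0; n < result.size(); ++n)
        result[n] = (kid2[n] & 0x80) ? kid1[n] : kid0[n];
    return result;
}

// Reinterpret between the two views without violating aliasing rules.
V16I1 to_bytes(const Lanes& v) { V16I1 b{}; std::memcpy(b.data(), v.data(), sizeof b); return b; }
Lanes to_lanes(const V16I1& b) { Lanes v{}; std::memcpy(v.data(), b.data(), sizeof v); return v; }

// Lane-wise compare against zero: all-ones where the lane is zero, else all-zeros.
Lanes cmpeq_zero(const Lanes& v) {
    Lanes m{};
    for (std::size_t k = 0; k < v.size(); ++k) m[k] = (v[k] == 0) ? -1 : 0;
    return m;
}

// Scalar model of the if-converted sample loop, two elements per iteration.
// Both the "then" value and the old value are computed unconditionally,
// which is the speculation Fred's SELECT definition mentions.
void sample_if_converted(std::int64_t* array, std::size_t n) {
    std::size_t i = 0;
    for (; i + 1 < n; i += 2) {
        Lanes old_vals{array[i], array[i + 1]};
        Lanes new_vals{array[i] ^ static_cast<std::int64_t>(i),
                       array[i + 1] ^ static_cast<std::int64_t>(i + 1)};
        Lanes is_zero = cmpeq_zero(old_vals);
        // Where the element was zero the mask MSBs are set, so kid1 (old value) wins.
        Lanes r = to_lanes(blend_bytes(to_bytes(new_vals), to_bytes(old_vals), to_bytes(is_zero)));
        array[i] = r[0];
        array[i + 1] = r[1];
    }
    // Scalar remainder for an odd element count.
    for (; i < n; ++i)
        if (array[i]) array[i] ^= static_cast<std::int64_t>(i);
}
```

Because the mask is all-ones or all-zeros within each 64-bit lane, selecting byte by byte gives the same result as selecting whole lanes, which is why a byte blend can serve a 64-bit element loop.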

Jian-Xin Lai | 1 Jul 07:33 2011

Re: Gatekeeper code review request for AMD 4.2.5 merge

This patch looks fine to me.

2011/6/24 Berg, Michael <michael.berg <at> amd.com>:
> All:  The attached patch files are:
>
>
>
> 1.)    Correctness changes for cvt operations
>
> 2.)    BD translation mapping updates for a set of instructions(AVX)
>
> 3.)    Scheduling additions for new spelling of ordered cmp instructions
>
> 4.)    And the definitions of those instructions in the machine description
> files.
>
>
>
> Can a Gatekeeper please review these changes which are part of our ongoing
> effort for BD support and code quality.
>
>
>
> These changes pass the following:
>
> a.)           No compile time failure for x86 build.
>
> b.)          The gcc regression test suite on x86/Linux with no new
> failures.
>
> c.)           The SPEC2006 test suite with the current AMD 1-copy config at
> both base and peak.
>
>
>
> Thx,
>
>
>
> m
>
>
>

-- 
Regards,
Lai Jian-Xin

Jian-Xin Lai | 1 Jul 07:35 2011

Re: Code review for fixing Assertion failure with -fpic -O1 [CG]

Looks fine to me. Please go ahead.

2011/6/30 Wu Yongchong <wuyongchong <at> gmail.com>:
> Can a gatekeeper help review this patch?
>
> The compiler is built with --with-build-optimize=DEBUG.
> See the bug description at https://bugs.open64.net/show_bug.cgi?id=823
>
> The problem is that we were calling:
>    GRA_LIVE_Recalc_Liveness(region ? REGION_get_rid( rwn) : NULL);
>    GRA_LIVE_Rename_TNs();
> at -O1, when we should not.  As a result, LRA encountered an
> unexpected GTN and asserted.
> The fix is to guard this code with:
>  if (!CG_localize_tns) {
> CG_localize_tns is set when the optimization level is <= 1.
>
> here is the patch
>
> Index: osprey/be/cg/cg.cxx
> ===================================================================
> --- osprey/be/cg/cg.cxx (revision 3663)
> +++ osprey/be/cg/cg.cxx (working copy)
> @@ -1487,8 +1487,10 @@
>   Check_for_Dump_ALL ( TP_CGEXP, NULL, "Pre LIS" );
>
>  #else
> -  GRA_LIVE_Recalc_Liveness(region ? REGION_get_rid( rwn) : NULL);
> -  GRA_LIVE_Rename_TNs();
> +  if (!CG_localize_tns) {
> +    GRA_LIVE_Recalc_Liveness(region ? REGION_get_rid( rwn) : NULL);
> +    GRA_LIVE_Rename_TNs();
> +  }
>  #if !defined(TARG_PPC32)    //  PPC IGLS_Schedule_Region bugs
>   IGLS_Schedule_Region (TRUE /* before register allocation */);
>  #ifdef TARG_X8664
>
> yongchong
>

-- 
Regards,
Lai Jian-Xin

Jian-Xin Lai | 1 Jul 07:36 2011

Re: code review request for CG SIB default at O3

This patch looks fine to me.

2011/6/6 Ramanarayanan, Ramshankar <Ramshankar.Ramanarayanan <at> amd.com>:
> I would like a code review for the attached diff.
>
>
>
> This diff gives more control over the X8664 “Scale, Index, Base” (SIB) flag
> CG_merge_counters_x86 by adding the BOOL variable CG_merge_counters_x86_set,
> which records whether the flag was set explicitly. Using this hook, we set
> CG_merge_counters_x86 at O3 unless it was set explicitly, thereby enabling
> SIB by default at O3.
>
>
>
> This change has been tested with SPEC 2006 and other FP applications.
>
>
>
> Best regards,
>
> Ram
>
>
>
> Member of Technical Staff,
>
> Open Source Compiler Engineering,
>
> Advanced Micro Devices
>

-- 
Regards,
Lai Jian-Xin

Hui Shi | 1 Jul 07:41 2011

Re: Code review for bug#707, liverange overlap after setup CODEMAP [WOPT]

Attached is an updated patch.

After discussing with An Xiaomi, this version removes An's previous fix for bug#707.

On Thu, Jun 30, 2011 at 10:58 PM, Ye, Mei <Mei.Ye <at> amd.com> wrote:

I am on vacation today and tomorrow.  Can other gatekeepers review this? Thx.

 

-Mei

 

From: Hui Shi [mailto:kalin.shi <at> gmail.com]
Sent: Wednesday, June 29, 2011 11:36 PM
To: open64-devel <at> lists.sourceforge.net
Subject: [Open64-devel] Code review for bug#707, liverange overlap after setup CODEMAP [WOPT]

 

Could a gatekeeper help review this fix?

In SSA::Create_CODEMAP, ILOAD folding is performed during coderep setup.
The problem is that ILOAD folding may revive a dead phi/chi node.

For example:
STMT1
sym2v4 = chi(sym2v3)  NOT LIVE
default_vsym_v4 = chi(default_vsym_v3) LIVE
....
STMT2
  ILOAD mu(default_vsym_v4)
...
STMT3
sym2v5 = chi(sym2v3) LIVE // sym2v3 is propagated down from the dead chi above.
default_vsym_v5 = chi(default_vsym_v4)

If sym2v4 may be used at STMT2, the ILOAD's MU node must be on STMT1's chi list
(because the MU node must alias with sym2; otherwise there is an alias issue).
The CHI node corresponding to that MU node must be live at STMT1, because set_required_mu sets the chi node live.
So it is okay for DSE to mark the first CHI dead if STMT1 and STMT3 write the same memory location:
1. It will not make STMT1 dead, because the other chi node is live.
2. The def statement of the ILOAD's MU node is still STMT1.

If ILOAD folding transforms the ILOAD into an LDID whose AUX is sym2,
it revives the dead chi sym2v4 = chi(sym2v3) without updating the later chi node's operand.

This introduces an overlap between the live ranges of sym2v3 and sym2v4.
The fix is to rename the CODEMAP if ILOAD folding revives a dead phi/chi node.
OPT_REVISE_SSA::Fold_lda_indirects also renames the CODEMAP after optimization.

The fix:
1. Record in htable whether a dead chi/phi node is revived in SSA::Create_CODEMAP.
   This only happens when ILOAD folding takes place and finds that the corresponding chi/phi is dead.
2. After SSA::Create_CODEMAP, rename the coderep if a dead chi/phi node was revived during CODEMAP setup.

Unlike fixing this in DSE by marking the first chi node live, this fix does not affect DSE, so it does not affect performance.
It has been verified with spec2006.rate on Xeon.

Regards
Shi Hui


Attachment (dse.patch): application/octet-stream, 5365 bytes
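
The overlap described in this message can be illustrated with a toy model. This is only a sketch under strong assumptions: it tracks the chi nodes of a single variable (sym2) in straight-line code, and the names Chi, has_overlap and rename_versions are hypothetical and unrelated to the actual WOPT data structures. It shows why reviving a dead chi without touching later operands leaves two versions live at once, and how a renaming pass restores a consistent chain.

```cpp
#include <vector>

// Toy chi node for one variable: result = chi(opnd).
struct Chi {
    int result;  // version defined by this chi, e.g. 4 for sym2v4
    int opnd;    // version consumed by this chi, e.g. 3 for sym2v3
    bool live;   // dead chis are ignored by later phases
};

// In straight-line code every live chi must consume the version produced by
// the nearest preceding live chi (or the entry version). If it consumes an
// older version, that older version and the newer one are live together.
bool has_overlap(const std::vector<Chi>& chis, int entry_version) {
    int current = entry_version;
    for (const Chi& c : chis) {
        if (!c.live) continue;
        if (c.opnd != current) return true;
        current = c.result;
    }
    return false;
}

// Renaming pass: rewire each live chi to the version that reaches it.
void rename_versions(std::vector<Chi>& chis, int entry_version) {
    int current = entry_version;
    for (Chi& c : chis) {
        if (!c.live) continue;
        c.opnd = current;
        current = c.result;
    }
}
```

With the chis of STMT1 and STMT3 modelled as {4, 3, dead} and {5, 3, live}, the chain is consistent. Marking the first chi live (as ILOAD folding does when it creates an LDID of sym2v4) makes has_overlap report a problem until rename_versions rewires the second chi to consume version 4.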
svn | 1 Jul 10:06 2011

r3667 - trunk/osprey/be/cg

Author: rramanar
Date: 2011-07-01 04:06:42 -0400 (Fri, 01 Jul 2011)
New Revision: 3667

Modified:
   trunk/osprey/be/cg/cg_flags.cxx
   trunk/osprey/be/cg/cg_flags.h
   trunk/osprey/be/cg/cgdriver.cxx
Log:

This diff enables more control to X8664 - Scale, Index, Base - (SIB) flag
CG_merge_counters_x86 with the BOOL variable CG_merge_counters_x86_set.
Using this hook, we proceed to set CG_merge_counters_x86 at O3 thereby
enabling default SIB at O3.

This change has been tested with SPEC 2006 int/fp and other FP applications.

Code reviewed by Jian-Xin Lai (CG domain gatekeeper)

Modified: trunk/osprey/be/cg/cg_flags.cxx
===================================================================
--- trunk/osprey/be/cg/cg_flags.cxx	2011-06-30 02:34:59 UTC (rev 3666)
+++ trunk/osprey/be/cg/cg_flags.cxx	2011-07-01 08:06:42 UTC (rev 3667)
@@ -127,6 +127,7 @@
 BOOL CG_LOOP_nounroll_best_fit_set = FALSE;
 BOOL CG_strcmp_expand = TRUE;
 BOOL CG_merge_counters_x86 = FALSE;
+BOOL CG_merge_counters_x86_set = FALSE;
 BOOL CG_interior_ptrs_x86 = FALSE;
 BOOL CG_NoClear_Avx_Simd = FALSE;
 #endif

Modified: trunk/osprey/be/cg/cg_flags.h
===================================================================
--- trunk/osprey/be/cg/cg_flags.h	2011-06-30 02:34:59 UTC (rev 3666)
+++ trunk/osprey/be/cg/cg_flags.h	2011-07-01 08:06:42 UTC (rev 3667)
@@ -538,6 +538,7 @@
 extern BOOL CG_branch_fuse;
 extern BOOL CG_strcmp_expand;
 extern BOOL CG_merge_counters_x86;
+extern BOOL CG_merge_counters_x86_set;
 extern BOOL CG_interior_ptrs_x86;  // enable,disable interior pointer trans
 extern BOOL CG_NoClear_Avx_Simd;
 #endif

Modified: trunk/osprey/be/cg/cgdriver.cxx
===================================================================
--- trunk/osprey/be/cg/cgdriver.cxx	2011-06-30 02:34:59 UTC (rev 3666)
+++ trunk/osprey/be/cg/cgdriver.cxx	2011-07-01 08:06:42 UTC (rev 3667)
@@ -481,7 +481,7 @@
   { OVK_BOOL,   OV_VISIBLE, TRUE, "strcmp_expand", "",
     0, 0, 0,    &CG_strcmp_expand, NULL },
   { OVK_BOOL,   OV_VISIBLE, TRUE, "merge_counters_x86", "",
-    0, 0, 0,    &CG_merge_counters_x86, NULL },
+    0, 0, 0,    &CG_merge_counters_x86, &CG_merge_counters_x86_set },
   { OVK_BOOL,   OV_VISIBLE, TRUE, "interior_ptrs", "",
     0, 0, 0,    &CG_interior_ptrs_x86, NULL },
   { OVK_BOOL,   OV_VISIBLE, TRUE, "noavx_clear", "",
@@ -2659,6 +2659,8 @@
         // for unrolled loops.
         CG_merge_counters_x86 = TRUE;
         LOCS_PRE_Enable_Unroll_RegPressure_Sched = TRUE;
+      } else if (CG_opt_level == 3 && CG_merge_counters_x86_set == FALSE) {
+        CG_merge_counters_x86 = TRUE;
       }
     }
 #endif //TARG_X8664
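
The interaction between the flag and its "_set" companion in the cgdriver.cxx hunk above can be sketched as follows. This is a simplified model, not the Open64 option machinery: BoolOption, set_from_command_line and apply_o3_default are hypothetical names, and only the newly added else-if branch is modelled (the preceding branch that also sets CG_merge_counters_x86 is left out).

```cpp
// Hypothetical stand-in for a boolean option paired with a
// "was it given on the command line" flag.
struct BoolOption {
    bool value = false;        // plays the role of CG_merge_counters_x86
    bool set_by_user = false;  // plays the role of CG_merge_counters_x86_set
};

// What the option parser would do when the user passes the option explicitly.
void set_from_command_line(BoolOption& opt, bool v) {
    opt.value = v;
    opt.set_by_user = true;
}

// Apply the optimization-level default only when the user has not spoken.
void apply_o3_default(BoolOption& opt, int opt_level) {
    if (opt_level == 3 && !opt.set_by_user) {
        opt.value = true;
    }
}
```

The point of the companion flag is that an explicit "off" from the user survives at O3, while an untouched option picks up the new default.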

svn | 1 Jul 18:46 2011

r3668 - in trunk/osprey/common/targ_info: isa/x8664 proc/x8664

Author: mberg
Date: 2011-07-01 12:46:18 -0400 (Fri, 01 Jul 2011)
New Revision: 3668

Modified:
   trunk/osprey/common/targ_info/isa/x8664/isa.cxx
   trunk/osprey/common/targ_info/isa/x8664/isa_operands.cxx
   trunk/osprey/common/targ_info/isa/x8664/isa_pack.cxx
   trunk/osprey/common/targ_info/isa/x8664/isa_print.cxx
   trunk/osprey/common/targ_info/isa/x8664/isa_properties.cxx
   trunk/osprey/common/targ_info/isa/x8664/isa_subset.cxx
   trunk/osprey/common/targ_info/proc/x8664/barcelona_si.cxx
   trunk/osprey/common/targ_info/proc/x8664/core_si.cxx
   trunk/osprey/common/targ_info/proc/x8664/em64t_si.cxx
   trunk/osprey/common/targ_info/proc/x8664/opteron_si.cxx
   trunk/osprey/common/targ_info/proc/x8664/orochi_si.cxx
   trunk/osprey/common/targ_info/proc/x8664/wolfdale_si.cxx
Log:
Correcting cvt operations, added BD translation map updates, and
scheduling info updates as well as the addition of ordered compares.

CR by Jian-Xin.
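
The compare spellings added in this revision (vcmpeqsd, vcmpltsd, ... vcmpordss) follow the usual SSE/AVX convention in which the predicate suffix is an assembler shorthand for the immediate operand of the generic compare instruction. A minimal sketch of that mapping is shown below; the function name compare_predicate_imm8 is hypothetical and is not part of the targ_info sources, and only the eight predicates named in the diff are covered.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Map a compare predicate suffix ("eq", "lt", ...) to the imm8 value used by
// cmpss/cmpsd/cmpps/cmppd and their VEX-encoded counterparts.
// Returns std::nullopt for a suffix outside the eight basic predicates.
std::optional<std::uint8_t> compare_predicate_imm8(std::string_view pred) {
    static constexpr std::string_view kNames[] = {
        "eq", "lt", "le", "unord", "neq", "nlt", "nle", "ord"};
    for (std::uint8_t i = 0; i < 8; ++i) {
        if (kNames[i] == pred) return i;
    }
    return std::nullopt;
}
```

Under this convention "vcmpordsd" is the same encoding as "vcmpsd" with an immediate of 7, which is why the diff can group the pseudo-ops with the generic compares for operand and packing purposes.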

Modified: trunk/osprey/common/targ_info/isa/x8664/isa.cxx
===================================================================
--- trunk/osprey/common/targ_info/isa/x8664/isa.cxx	2011-07-01 08:06:42 UTC (rev 3667)
+++ trunk/osprey/common/targ_info/isa/x8664/isa.cxx	2011-07-01 16:46:18 UTC (rev 3668)
@@ -1635,6 +1635,7 @@
 	      "cmpnltss",
 	      "cmpnless",
 	      "cmpordss",
+ 
 	      "emms",
 	      "stmxcsr",
 	      "ldmxcsr",
@@ -2387,14 +2388,14 @@
               "vfcmpx128v32",
               "vfcmpxx128v32",
               "vfcmpxxx128v32",
-              "vfcmpsd",
-              "vfcmpxsd",
-              "vfcmpxxsd",
-              "vfcmpxxxsd",
-              "vfcmpss",
-              "vfcmpxss",
-              "vfcmpxxss",
-              "vfcmpxxxss",
+              "vcmpsd",
+              "vcmpxsd",
+              "vcmpxxsd",
+              "vcmpxxxsd",
+              "vcmpss",
+              "vcmpxss",
+              "vcmpxxss",
+              "vcmpxxxss",
               "vcomisd",
               "vcomixsd",
               "vcomixxsd",
@@ -3443,6 +3444,38 @@
               "vxzero128v32",
               "vxzero32",
               "vzeroall",
+	      "vcmpeqpd",
+	      "vcmpltpd",
+	      "vcmplepd",
+	      "vcmpunordpd",
+	      "vcmpneqpd",
+	      "vcmpnltpd",
+	      "vcmpnlepd",
+	      "vcmpordpd",
+	      "vcmpeqps",
+	      "vcmpltps",
+	      "vcmpleps",
+	      "vcmpunordps",
+	      "vcmpneqps",
+	      "vcmpnltps",
+	      "vcmpnleps",
+	      "vcmpordps",
+	      "vcmpeqsd",
+	      "vcmpltsd",
+	      "vcmplesd",
+	      "vcmpunordsd",
+	      "vcmpneqsd",
+	      "vcmpnltsd",
+	      "vcmpnlesd",
+	      "vcmpordsd",
+	      "vcmpeqss",
+	      "vcmpltss",
+	      "vcmpless",
+	      "vcmpunordss",
+	      "vcmpneqss",
+	      "vcmpnltss",
+	      "vcmpnless",
+	      "vcmpordss",

               /* INTEL FMA instructions */
               "xfmadd132pd",

Modified: trunk/osprey/common/targ_info/isa/x8664/isa_operands.cxx
===================================================================
--- trunk/osprey/common/targ_info/isa/x8664/isa_operands.cxx	2011-07-01 08:06:42 UTC (rev 3667)
+++ trunk/osprey/common/targ_info/isa/x8664/isa_operands.cxx	2011-07-01 16:46:18 UTC (rev 3668)
@@ -2335,12 +2335,6 @@
                            TOP_vfblend128v64,
                            TOP_vfblend128v32,
                            TOP_vpclmulqdq,
-                           TOP_vcmppd,
-                           TOP_vcmpps,
-                           TOP_vfcmp128v64,
-                           TOP_vfcmp128v32,
-                           TOP_vfcmpsd,
-                           TOP_vfcmpss,
                            TOP_vfdp128v64,
                            TOP_vfdp128v32,
                            TOP_vfinsrf128,
@@ -2385,8 +2379,8 @@
                            TOP_vpclmulqdqx,
                            TOP_vfcmpx128v64,
                            TOP_vfcmpx128v32,
-                           TOP_vfcmpxsd,
-                           TOP_vfcmpxss,
+                           TOP_vcmpxsd,
+                           TOP_vcmpxss,
                            TOP_vfdpx128v64,
                            TOP_vfdpx128v32,
                            TOP_vfinsrxf128,
@@ -2434,8 +2428,8 @@
                            TOP_vpclmulqdqxx,
                            TOP_vfcmpxx128v64,
                            TOP_vfcmpxx128v32,
-                           TOP_vfcmpxxsd,
-                           TOP_vfcmpxxss,
+                           TOP_vcmpxxsd,
+                           TOP_vcmpxxss,
                            TOP_vfdpxx128v64,
                            TOP_vfdpxx128v32,
                            TOP_vfinsrxx128v32,
@@ -2485,8 +2479,8 @@
                            TOP_vpclmulqdqxxx,
                            TOP_vfcmpxxx128v64,
                            TOP_vfcmpxxx128v32,
-                           TOP_vfcmpxxxsd,
-                           TOP_vfcmpxxxss,
+                           TOP_vcmpxxxsd,
+                           TOP_vcmpxxxss,
                            TOP_vfdpxxx128v64,
                            TOP_vfdpxxx128v32,
                            TOP_vfinsrxxxf128,
@@ -5282,6 +5276,8 @@
   Operand(1, x87,  opnd2);

   Instruction_Group("fp compare",
+                    TOP_vcmpsd,
+                    TOP_vcmpss,
 		    TOP_cmpss,
 		    TOP_cmpsd,
 		    TOP_UNDEFINED);
@@ -5290,6 +5286,7 @@
   Operand(1, fp64,  opnd2);
   Operand(2, simm8, opnd3);

+  // pseudo assembler ops that encode to cmpss, cmpsd, vcmpss and vcmpsd
   Instruction_Group("fp compare I",
 		    TOP_cmpeqsd,
 		    TOP_cmpltsd,
@@ -5307,6 +5304,24 @@
 		    TOP_cmpnltss,
 		    TOP_cmpnless,
 		    TOP_cmpordss,
+		    TOP_cmpordss,
+		    TOP_vcmpeqsd,
+		    TOP_vcmpltsd,
+		    TOP_vcmplesd,
+		    TOP_vcmpunordsd,
+		    TOP_vcmpneqsd,
+		    TOP_vcmpnltsd,
+		    TOP_vcmpnlesd,
+		    TOP_vcmpordsd,
+		    TOP_vcmpeqss,
+		    TOP_vcmpltss,
+		    TOP_vcmpless,
+		    TOP_vcmpunordss,
+		    TOP_vcmpneqss,
+		    TOP_vcmpnltss,
+		    TOP_vcmpnless,
+		    TOP_vcmpordss,
+		    TOP_vcmpordss,
 		    TOP_UNDEFINED);
   Result(0, fp64);
   Operand(0, fp64,  opnd1);
@@ -5365,6 +5380,10 @@
   Instruction_Group("fp vector compare",
 		    TOP_cmpps,
 		    TOP_cmppd,
+                    TOP_vcmppd,
+                    TOP_vcmpps,
+                    TOP_vfcmp128v64,
+                    TOP_vfcmp128v32,
 		    TOP_UNDEFINED);
   Result(0, fp128);
   Operand(0, fp128,  opnd1);
@@ -5388,6 +5407,22 @@
 		    TOP_cmpnltps,
 		    TOP_cmpnleps,
 		    TOP_cmpordps,
+		    TOP_vcmpeqpd,
+		    TOP_vcmpltpd,
+		    TOP_vcmplepd,
+		    TOP_vcmpunordpd,
+		    TOP_vcmpneqpd,
+		    TOP_vcmpnltpd,
+		    TOP_vcmpnlepd,
+		    TOP_vcmpordpd,
+		    TOP_vcmpeqps,
+		    TOP_vcmpltps,
+		    TOP_vcmpleps,
+		    TOP_vcmpunordps,
+		    TOP_vcmpneqps,
+		    TOP_vcmpnltps,
+		    TOP_vcmpnleps,
+		    TOP_vcmpordps,
 		    TOP_UNDEFINED);
   Result(0, fp128);
   Operand(0, fp128,  opnd1);

Modified: trunk/osprey/common/targ_info/isa/x8664/isa_pack.cxx
===================================================================
--- trunk/osprey/common/targ_info/isa/x8664/isa_pack.cxx	2011-07-01 08:06:42 UTC (rev 3667)
+++ trunk/osprey/common/targ_info/isa/x8664/isa_pack.cxx	2011-07-01 16:46:18 UTC (rev 3668)
@@ -2371,14 +2371,46 @@
                     TOP_vfcmpx128v32,   0x000000ff,
                     TOP_vfcmpxx128v32,  0x000000ff,
                     TOP_vfcmpxxx128v32, 0x000000ff,
-                    TOP_vfcmpsd,        0x000000ff,
-                    TOP_vfcmpxsd,       0x000000ff,
-                    TOP_vfcmpxxsd,      0x000000ff,
-                    TOP_vfcmpxxxsd,     0x000000ff,
-                    TOP_vfcmpss,        0x000000ff,
-                    TOP_vfcmpxss,       0x000000ff,
-                    TOP_vfcmpxxss,      0x000000ff,
-                    TOP_vfcmpxxxss,     0x000000ff,
+                    TOP_vcmpeqpd,       0x000000ff,
+                    TOP_vcmpltpd,       0x000000ff,
+                    TOP_vcmplepd,       0x000000ff,
+                    TOP_vcmpunordpd,    0x000000ff,
+                    TOP_vcmpneqpd,      0x000000ff,
+                    TOP_vcmpnltpd,      0x000000ff,
+                    TOP_vcmpnlepd,      0x000000ff,
+                    TOP_vcmpordpd,      0x000000ff,
+                    TOP_vcmpeqps,       0x000000ff,
+                    TOP_vcmpltps,       0x000000ff,
+                    TOP_vcmpleps,       0x000000ff,
+                    TOP_vcmpunordps,    0x000000ff,
+                    TOP_vcmpneqps,      0x000000ff,
+                    TOP_vcmpnltps,      0x000000ff,
+                    TOP_vcmpnleps,      0x000000ff,
+                    TOP_vcmpordps,      0x000000ff,
+                    TOP_vcmpeqss,       0x000000ff,
+                    TOP_vcmpltss,       0x000000ff,
+                    TOP_vcmpless,       0x000000ff,
+                    TOP_vcmpunordss,    0x000000ff,
+                    TOP_vcmpneqss,      0x000000ff,
+                    TOP_vcmpnltss,      0x000000ff,
+                    TOP_vcmpnless,      0x000000ff,
+                    TOP_vcmpordss,      0x000000ff,
+                    TOP_vcmpeqsd,       0x000000ff,
+                    TOP_vcmpltsd,       0x000000ff,
+                    TOP_vcmplesd,       0x000000ff,
+                    TOP_vcmpunordsd,    0x000000ff,
+                    TOP_vcmpneqsd,      0x000000ff,
+                    TOP_vcmpnltsd,      0x000000ff,
+                    TOP_vcmpnlesd,      0x000000ff,
+                    TOP_vcmpordsd,      0x000000ff,
+                    TOP_vcmpsd,         0x000000ff,
+                    TOP_vcmpxsd,        0x000000ff,
+                    TOP_vcmpxxsd,       0x000000ff,
+                    TOP_vcmpxxxsd,      0x000000ff,
+                    TOP_vcmpss,         0x000000ff,
+                    TOP_vcmpxss,        0x000000ff,
+                    TOP_vcmpxxss,       0x000000ff,
+                    TOP_vcmpxxxss,      0x000000ff,
                     TOP_vcomisd,        0x000000ff,
                     TOP_vcomixsd,       0x000000ff,
                     TOP_vcomixxsd,      0x000000ff,

Modified: trunk/osprey/common/targ_info/isa/x8664/isa_print.cxx
===================================================================
--- trunk/osprey/common/targ_info/isa/x8664/isa_print.cxx	2011-07-01 08:06:42 UTC (rev 3667)
+++ trunk/osprey/common/targ_info/isa/x8664/isa_print.cxx	2011-07-01 16:46:18 UTC (rev 3668)
@@ -1663,6 +1663,8 @@
                            TOP_vcvtsi2sdq,
                            TOP_vcvtsi2ss,
                            TOP_vcvtsi2ssq,
+                           TOP_vcvtsd2ss,
+                           TOP_vcvtss2sd,
                            TOP_vfdiv128v64,
                            TOP_vfdiv128v32,
                            TOP_vdivsd,
@@ -1797,6 +1799,38 @@
                            TOP_vaesdec,
                            TOP_vaesdeclast,
                            TOP_vaeskeygenassist,
+                           TOP_vcmpeqpd,
+                           TOP_vcmpltpd,
+                           TOP_vcmplepd,
+                           TOP_vcmpunordpd,
+                           TOP_vcmpneqpd,
+                           TOP_vcmpnltpd,
+                           TOP_vcmpnlepd,
+                           TOP_vcmpordpd,
+                           TOP_vcmpeqps,
+                           TOP_vcmpltps,
+                           TOP_vcmpleps,
+                           TOP_vcmpunordps,
+                           TOP_vcmpneqps,
+                           TOP_vcmpnltps,
+                           TOP_vcmpnleps,
+                           TOP_vcmpordps,
+                           TOP_vcmpeqss,
+                           TOP_vcmpltss,
+                           TOP_vcmpless,
+                           TOP_vcmpunordss,
+                           TOP_vcmpneqss,
+                           TOP_vcmpnltss,
+                           TOP_vcmpnless,
+                           TOP_vcmpordss,
+                           TOP_vcmpeqsd,
+                           TOP_vcmpltsd,
+                           TOP_vcmplesd,
+                           TOP_vcmpunordsd,
+                           TOP_vcmpneqsd,
+                           TOP_vcmpnltsd,
+                           TOP_vcmpnlesd,
+                           TOP_vcmpordsd,
                            TOP_UNDEFINED);

   /* dest=op(src1, memop), non-x86-style */
@@ -2523,8 +2557,8 @@
                            TOP_vfcmp128v32,
                            TOP_vcmppd,
                            TOP_vcmpps,
-                           TOP_vfcmpsd,
-                           TOP_vfcmpss,
+                           TOP_vcmpsd,
+                           TOP_vcmpss,
                            TOP_vfdp128v64,
                            TOP_vfdp128v32,
                            TOP_vinsr128v8,
@@ -2588,8 +2622,8 @@
                            TOP_vcmpistrmx,
                            TOP_vfcmpx128v64,
                            TOP_vfcmpx128v32,
-                           TOP_vfcmpxsd,
-                           TOP_vfcmpxss,
+                           TOP_vcmpxsd,
+                           TOP_vcmpxss,
                            TOP_vfdpx128v64,
                            TOP_vfdpx128v32,
                            TOP_vinsrx128v8,
@@ -2651,8 +2685,8 @@
                            TOP_vcmpistrmxx,
                            TOP_vfcmpxx128v64,
                            TOP_vfcmpxx128v32,
-                           TOP_vfcmpxxsd,
-                           TOP_vfcmpxxss,
+                           TOP_vcmpxxsd,
+                           TOP_vcmpxxss,
                            TOP_vfdpxx128v64,
                            TOP_vfdpxx128v32,
                            TOP_vinsrxx128v8,
@@ -2714,8 +2748,8 @@
                            TOP_vcmpistrmxxx,
                            TOP_vfcmpxxx128v64,
                            TOP_vfcmpxxx128v32,
-                           TOP_vfcmpxxxsd,
-                           TOP_vfcmpxxxss,
+                           TOP_vcmpxxxsd,
+                           TOP_vcmpxxxss,
                            TOP_vfdpxxx128v64,
                            TOP_vfdpxxx128v32,
                            TOP_vinsrxxx128v8,
@@ -3272,8 +3306,6 @@
 			   TOP_cvtpi2pd,
 			   TOP_cvtpd2pi,
 			   TOP_cvttpd2pi,
-                           TOP_vcvtsd2ss,
-                           TOP_vcvtss2sd,
 			   TOP_ldc32,
 			   TOP_ldc64,
 			   TOP_movabsq,

Modified: trunk/osprey/common/targ_info/isa/x8664/isa_properties.cxx
===================================================================
--- trunk/osprey/common/targ_info/isa/x8664/isa_properties.cxx	2011-07-01 08:06:42 UTC (rev 3667)
+++ trunk/osprey/common/targ_info/isa/x8664/isa_properties.cxx	2011-07-01 16:46:18 UTC (rev 3668)
@@ -1927,14 +1927,46 @@
                      TOP_vfcmpx128v32,
                      TOP_vfcmpxx128v32,
                      TOP_vfcmpxxx128v32,
-                     TOP_vfcmpsd,
-                     TOP_vfcmpxsd,
-                     TOP_vfcmpxxsd,
-                     TOP_vfcmpxxxsd,
-                     TOP_vfcmpss,
-                     TOP_vfcmpxss,
-                     TOP_vfcmpxxss,
-                     TOP_vfcmpxxxss,
+                     TOP_vcmpsd,
+                     TOP_vcmpxsd,
+                     TOP_vcmpxxsd,
+                     TOP_vcmpxxxsd,
+                     TOP_vcmpss,
+                     TOP_vcmpxss,
+                     TOP_vcmpxxss,
+                     TOP_vcmpxxxss,
+                     TOP_vcmpeqpd,
+                     TOP_vcmpltpd,
+                     TOP_vcmplepd,
+                     TOP_vcmpunordpd,
+                     TOP_vcmpneqpd,
+                     TOP_vcmpnltpd,
+                     TOP_vcmpnlepd,
+                     TOP_vcmpordpd,
+                     TOP_vcmpeqps,
+                     TOP_vcmpltps,
+                     TOP_vcmpleps,
+                     TOP_vcmpunordps,
+                     TOP_vcmpneqps,
+                     TOP_vcmpnltps,
+                     TOP_vcmpnleps,
+                     TOP_vcmpordps,
+                     TOP_vcmpeqss,
+                     TOP_vcmpltss,
+                     TOP_vcmpless,
+                     TOP_vcmpunordss,
+                     TOP_vcmpneqss,
+                     TOP_vcmpnltss,
+                     TOP_vcmpnless,
+                     TOP_vcmpordss,
+                     TOP_vcmpeqsd,
+                     TOP_vcmpltsd,
+                     TOP_vcmplesd,
+                     TOP_vcmpunordsd,
+                     TOP_vcmpneqsd,
+                     TOP_vcmpnltsd,
+                     TOP_vcmpnlesd,
+                     TOP_vcmpordsd,
                      TOP_vcomisd,
                      TOP_vcomixsd,
                      TOP_vcomixxsd,
@@ -6324,12 +6356,12 @@
                      TOP_vfperm2xf128,
                      TOP_vfperm2xxf128,
                      TOP_vfperm2xxxf128,
-                     TOP_vfcmpxsd,
-                     TOP_vfcmpxxsd,
-                     TOP_vfcmpxxxsd,
-                     TOP_vfcmpxss,
-                     TOP_vfcmpxxss,
-                     TOP_vfcmpxxxss,
+                     TOP_vcmpxsd,
+                     TOP_vcmpxxsd,
+                     TOP_vcmpxxxsd,
+                     TOP_vcmpxss,
+                     TOP_vcmpxxss,
+                     TOP_vcmpxxxss,
                      TOP_vroundxsd,
                      TOP_vroundxxsd,
                      TOP_vroundxxxsd,
@@ -7682,14 +7714,78 @@
                      TOP_vfcmpx128v32,
                      TOP_vfcmpxx128v32,
                      TOP_vfcmpxxx128v32,
-                     TOP_vfcmpsd,
-                     TOP_vfcmpxsd,
-                     TOP_vfcmpxxsd,
-                     TOP_vfcmpxxxsd,
-                     TOP_vfcmpss,
-                     TOP_vfcmpxss,
-                     TOP_vfcmpxxss,
-                     TOP_vfcmpxxxss,
+                     TOP_vcmpeqpd,
+                     TOP_vcmpltpd,
+                     TOP_vcmplepd,
+                     TOP_vcmpunordpd,
+                     TOP_vcmpneqpd,
+                     TOP_vcmpnltpd,
+                     TOP_vcmpnlepd,
+                     TOP_vcmpordpd,
+                     TOP_vcmpeqps,
+                     TOP_vcmpltps,
+                     TOP_vcmpleps,
+                     TOP_vcmpunordps,
+                     TOP_vcmpneqps,
+                     TOP_vcmpnltps,
+                     TOP_vcmpnleps,
+                     TOP_vcmpordps,
+                     TOP_vcmpeqss,
+                     TOP_vcmpltss,
+                     TOP_vcmpless,
+                     TOP_vcmpunordss,
+                     TOP_vcmpneqss,
+                     TOP_vcmpnltss,
+                     TOP_vcmpnless,
+                     TOP_vcmpordss,
+                     TOP_vcmpeqsd,
+                     TOP_vcmpltsd,
+                     TOP_vcmplesd,
+                     TOP_vcmpunordsd,
+                     TOP_vcmpneqsd,
+                     TOP_vcmpnltsd,
+                     TOP_vcmpnlesd,
+                     TOP_vcmpordsd,
+                     TOP_vcmpsd,
+                     TOP_vcmpxsd,
+                     TOP_vcmpxxsd,
+                     TOP_vcmpxxxsd,
+                     TOP_vcmpss,
+                     TOP_vcmpxss,
+                     TOP_vcmpxxss,
+                     TOP_vcmpxxxss,
+                     TOP_vcmpeqpd,
+                     TOP_vcmpltpd,
+                     TOP_vcmplepd,
+                     TOP_vcmpunordpd,
+                     TOP_vcmpneqpd,
+                     TOP_vcmpnltpd,
+                     TOP_vcmpnlepd,
+                     TOP_vcmpordpd,
+                     TOP_vcmpeqps,
+                     TOP_vcmpltps,
+                     TOP_vcmpleps,
+                     TOP_vcmpunordps,
+                     TOP_vcmpneqps,
+                     TOP_vcmpnltps,
+                     TOP_vcmpnleps,
+                     TOP_vcmpordps,
+                     TOP_vcmpeqss,
+                     TOP_vcmpltss,
+                     TOP_vcmpless,
+                     TOP_vcmpunordss,
+                     TOP_vcmpneqss,
+                     TOP_vcmpnltss,
+                     TOP_vcmpnless,
+                     TOP_vcmpordss,
+                     TOP_vcmpeqsd,
+                     TOP_vcmpltsd,
+                     TOP_vcmplesd,
+                     TOP_vcmpunordsd,
+                     TOP_vcmpneqsd,
+                     TOP_vcmpnltsd,
+                     TOP_vcmpnlesd,
+                     TOP_vcmpordsd,
                      TOP_vcomisd,
                      TOP_vcomixsd,
                      TOP_vcomixxsd,
@@ -10327,14 +10423,46 @@
                      TOP_vucomixss,
                      TOP_vucomixxss,
                      TOP_vucomixxxss,
-                     TOP_vfcmpsd,
-                     TOP_vfcmpxsd,
-                     TOP_vfcmpxxsd,
-                     TOP_vfcmpxxxsd,
-                     TOP_vfcmpss,
-                     TOP_vfcmpxss,
-                     TOP_vfcmpxxss,
-                     TOP_vfcmpxxxss,
+                     TOP_vcmpsd,
+                     TOP_vcmpxsd,
+                     TOP_vcmpxxsd,
+                     TOP_vcmpxxxsd,
+                     TOP_vcmpss,
+                     TOP_vcmpxss,
+                     TOP_vcmpxxss,
+                     TOP_vcmpxxxss,
+                     TOP_vcmpeqpd,
+                     TOP_vcmpltpd,
+                     TOP_vcmplepd,
+                     TOP_vcmpunordpd,
+                     TOP_vcmpneqpd,
+                     TOP_vcmpnltpd,
+                     TOP_vcmpnlepd,
+                     TOP_vcmpordpd,
+                     TOP_vcmpeqps,
+                     TOP_vcmpltps,
+                     TOP_vcmpleps,
+                     TOP_vcmpunordps,
+                     TOP_vcmpneqps,
+                     TOP_vcmpnltps,
+                     TOP_vcmpnleps,
+                     TOP_vcmpordps,
+                     TOP_vcmpeqss,
+                     TOP_vcmpltss,
+                     TOP_vcmpless,
+                     TOP_vcmpunordss,
+                     TOP_vcmpneqss,
+                     TOP_vcmpnltss,
+                     TOP_vcmpnless,
+                     TOP_vcmpordss,
+                     TOP_vcmpeqsd,
+                     TOP_vcmpltsd,
+                     TOP_vcmplesd,
+                     TOP_vcmpunordsd,
+                     TOP_vcmpneqsd,
+                     TOP_vcmpnltsd,
+                     TOP_vcmpnlesd,
+                     TOP_vcmpordsd,
                      TOP_vroundsd,
                      TOP_vroundxsd,
                      TOP_vroundxxsd,
@@ -16654,14 +16782,46 @@
                      TOP_vfperm2xf128,
                      TOP_vfperm2xxf128,
                      TOP_vfperm2xxxf128,
-                     TOP_vfcmpsd,
-                     TOP_vfcmpxsd,
-                     TOP_vfcmpxxsd,
-                     TOP_vfcmpxxxsd,
-                     TOP_vfcmpss,
-                     TOP_vfcmpxss,
-                     TOP_vfcmpxxss,
-                     TOP_vfcmpxxxss,
+                     TOP_vcmpsd,
+                     TOP_vcmpxsd,
+                     TOP_vcmpxxsd,
+                     TOP_vcmpxxxsd,
+                     TOP_vcmpss,
+                     TOP_vcmpxss,
+                     TOP_vcmpxxss,
+                     TOP_vcmpxxxss,
+                     TOP_vcmpeqpd,
+                     TOP_vcmpltpd,
+                     TOP_vcmplepd,
+                     TOP_vcmpunordpd,
+                     TOP_vcmpneqpd,
+                     TOP_vcmpnltpd,
+                     TOP_vcmpnlepd,
+                     TOP_vcmpordpd,
+                     TOP_vcmpeqps,
+                     TOP_vcmpltps,
+                     TOP_vcmpleps,
+                     TOP_vcmpunordps,
+                     TOP_vcmpneqps,
+                     TOP_vcmpnltps,
+                     TOP_vcmpnleps,
+                     TOP_vcmpordps,
+                     TOP_vcmpeqss,
+                     TOP_vcmpltss,
+                     TOP_vcmpless,
+                     TOP_vcmpunordss,
+                     TOP_vcmpneqss,
+                     TOP_vcmpnltss,
+                     TOP_vcmpnless,
+                     TOP_vcmpordss,
+                     TOP_vcmpeqsd,
+                     TOP_vcmpltsd,
+                     TOP_vcmplesd,
+                     TOP_vcmpunordsd,
+                     TOP_vcmpneqsd,
+                     TOP_vcmpnltsd,
+                     TOP_vcmpnlesd,
+                     TOP_vcmpordsd,
                      TOP_vroundsd,
                      TOP_vroundxsd,
                      TOP_vroundxxsd,

Modified: trunk/osprey/common/targ_info/isa/x8664/isa_subset.cxx
===================================================================
--- trunk/osprey/common/targ_info/isa/x8664/isa_subset.cxx	2011-07-01 08:06:42 UTC (rev 3667)
+++ trunk/osprey/common/targ_info/isa/x8664/isa_subset.cxx	2011-07-01 16:46:18 UTC (rev 3668)
@@ -2385,14 +2385,46 @@
                     TOP_vfcmpx128v32,
                     TOP_vfcmpxx128v32,
                     TOP_vfcmpxxx128v32,
-                    TOP_vfcmpsd,
-                    TOP_vfcmpxsd,
-                    TOP_vfcmpxxsd,
-                    TOP_vfcmpxxxsd,
-                    TOP_vfcmpss,
-                    TOP_vfcmpxss,
-                    TOP_vfcmpxxss,
-                    TOP_vfcmpxxxss,
+                    TOP_vcmpeqpd,
+                    TOP_vcmpltpd,
+                    TOP_vcmplepd,
+                    TOP_vcmpunordpd,
+                    TOP_vcmpneqpd,
+                    TOP_vcmpnltpd,
+                    TOP_vcmpnlepd,
+                    TOP_vcmpordpd,
+                    TOP_vcmpeqps,
+                    TOP_vcmpltps,
+                    TOP_vcmpleps,
+                    TOP_vcmpunordps,
+                    TOP_vcmpneqps,
+                    TOP_vcmpnltps,
+                    TOP_vcmpnleps,
+                    TOP_vcmpordps,
+                    TOP_vcmpeqss,
+                    TOP_vcmpltss,
+                    TOP_vcmpless,
+                    TOP_vcmpunordss,
+                    TOP_vcmpneqss,
+                    TOP_vcmpnltss,
+                    TOP_vcmpnless,
+                    TOP_vcmpordss,
+                    TOP_vcmpeqsd,
+                    TOP_vcmpltsd,
+                    TOP_vcmplesd,
+                    TOP_vcmpunordsd,
+                    TOP_vcmpneqsd,
+                    TOP_vcmpnltsd,
+                    TOP_vcmpnlesd,
+                    TOP_vcmpordsd,
+                    TOP_vcmpsd,
+                    TOP_vcmpxsd,
+                    TOP_vcmpxxsd,
+                    TOP_vcmpxxxsd,
+                    TOP_vcmpss,
+                    TOP_vcmpxss,
+                    TOP_vcmpxxss,
+                    TOP_vcmpxxxss,
                     TOP_vcomisd,
                     TOP_vcomixsd,
                     TOP_vcomixxsd,

Modified: trunk/osprey/common/targ_info/proc/x8664/barcelona_si.cxx
===================================================================
--- trunk/osprey/common/targ_info/proc/x8664/barcelona_si.cxx	2011-07-01 08:06:42 UTC (rev 3667)
+++ trunk/osprey/common/targ_info/proc/x8664/barcelona_si.cxx	2011-07-01 16:46:18 UTC (rev 3668)
@@ -3664,8 +3664,40 @@
                         TOP_vfcmp128v64,
                         TOP_vcmpps,
                         TOP_vfcmp128v32,
-                        TOP_vfcmpsd,
-                        TOP_vfcmpss,
+                        TOP_vcmpsd,
+                        TOP_vcmpss,
+                        TOP_vcmpeqpd,
+                        TOP_vcmpltpd,
+                        TOP_vcmplepd,
+                        TOP_vcmpunordpd,
+                        TOP_vcmpneqpd,
+                        TOP_vcmpnltpd,
+                        TOP_vcmpnlepd,
+                        TOP_vcmpordpd,
+                        TOP_vcmpeqps,
+                        TOP_vcmpltps,
+                        TOP_vcmpleps,
+                        TOP_vcmpunordps,
+                        TOP_vcmpneqps,
+                        TOP_vcmpnltps,
+                        TOP_vcmpnleps,
+                        TOP_vcmpordps,
+                        TOP_vcmpeqss,
+                        TOP_vcmpltss,
+                        TOP_vcmpless,
+                        TOP_vcmpunordss,
+                        TOP_vcmpneqss,
+                        TOP_vcmpnltss,
+                        TOP_vcmpnless,
+                        TOP_vcmpordss,
+                        TOP_vcmpeqsd,
+                        TOP_vcmpltsd,
+                        TOP_vcmplesd,
+                        TOP_vcmpunordsd,
+                        TOP_vcmpneqsd,
+                        TOP_vcmpnltsd,
+                        TOP_vcmpnlesd,
+                        TOP_vcmpordsd,
                         TOP_vfhadd128v64,
                         TOP_vfhadd128v32,
                         TOP_vfhsub128v64,
@@ -3703,12 +3735,12 @@
                         TOP_vfcmpx128v32,
                         TOP_vfcmpxx128v32,
                         TOP_vfcmpxxx128v32,
-                        TOP_vfcmpxsd,
-                        TOP_vfcmpxxsd,
-                        TOP_vfcmpxxxsd,
-                        TOP_vfcmpxss,
-                        TOP_vfcmpxxss,
-                        TOP_vfcmpxxxss,
+                        TOP_vcmpxsd,
+                        TOP_vcmpxxsd,
+                        TOP_vcmpxxxsd,
+                        TOP_vcmpxss,
+                        TOP_vcmpxxss,
+                        TOP_vcmpxxxss,
                         TOP_vfhaddx128v64,
                         TOP_vfhaddxx128v64,
                         TOP_vfhaddxxx128v64,

Modified: trunk/osprey/common/targ_info/proc/x8664/core_si.cxx
===================================================================
--- trunk/osprey/common/targ_info/proc/x8664/core_si.cxx	2011-07-01 08:06:42 UTC (rev 3667)
+++ trunk/osprey/common/targ_info/proc/x8664/core_si.cxx	2011-07-01 16:46:18 UTC (rev 3668)
@@ -3751,8 +3751,40 @@
                         TOP_vfcmp128v64,
                         TOP_vcmpps,
                         TOP_vfcmp128v32,
-                        TOP_vfcmpsd,
-                        TOP_vfcmpss,
+                        TOP_vcmpsd,
+                        TOP_vcmpss,
+                        TOP_vcmpeqpd,
+                        TOP_vcmpltpd,
+                        TOP_vcmplepd,
+                        TOP_vcmpunordpd,
+                        TOP_vcmpneqpd,
+                        TOP_vcmpnltpd,
+                        TOP_vcmpnlepd,
+                        TOP_vcmpordpd,
+                        TOP_vcmpeqps,
+                        TOP_vcmpltps,
+                        TOP_vcmpleps,
+                        TOP_vcmpunordps,
+                        TOP_vcmpneqps,
+                        TOP_vcmpnltps,
+                        TOP_vcmpnleps,
+                        TOP_vcmpordps,
+                        TOP_vcmpeqss,
+                        TOP_vcmpltss,
+                        TOP_vcmpless,
+                        TOP_vcmpunordss,
+                        TOP_vcmpneqss,
+                        TOP_vcmpnltss,
+                        TOP_vcmpnless,
+                        TOP_vcmpordss,
+                        TOP_vcmpeqsd,
+                        TOP_vcmpltsd,
+                        TOP_vcmplesd,
+                        TOP_vcmpunordsd,
+                        TOP_vcmpneqsd,
+                        TOP_vcmpnltsd,
+                        TOP_vcmpnlesd,
+                        TOP_vcmpordsd,
                         TOP_vfhadd128v64,
                         TOP_vfhadd128v32,
                         TOP_vfhsub128v64,
@@ -3790,12 +3822,12 @@
                         TOP_vfcmpx128v32,
                         TOP_vfcmpxx128v32,
                         TOP_vfcmpxxx128v32,
-                        TOP_vfcmpxsd,
-                        TOP_vfcmpxxsd,
-                        TOP_vfcmpxxxsd,
-                        TOP_vfcmpxss,
-                        TOP_vfcmpxxss,
-                        TOP_vfcmpxxxss,
+                        TOP_vcmpxsd,
+                        TOP_vcmpxxsd,
+                        TOP_vcmpxxxsd,
+                        TOP_vcmpxss,
+                        TOP_vcmpxxss,
+                        TOP_vcmpxxxss,
                         TOP_vfhaddx128v64,
                         TOP_vfhaddxx128v64,
                         TOP_vfhaddxxx128v64,

Modified: trunk/osprey/common/targ_info/proc/x8664/em64t_si.cxx
===================================================================
--- trunk/osprey/common/targ_info/proc/x8664/em64t_si.cxx	2011-07-01 08:06:42 UTC (rev 3667)
+++ trunk/osprey/common/targ_info/proc/x8664/em64t_si.cxx	2011-07-01 16:46:18 UTC (rev 3668)
@@ -3622,8 +3622,40 @@
                         TOP_vfcmp128v64,
                         TOP_vcmpps,
                         TOP_vfcmp128v32,
-                        TOP_vfcmpsd,
-                        TOP_vfcmpss,
+                        TOP_vcmpsd,
+                        TOP_vcmpss,
+                        TOP_vcmpeqpd,
+                        TOP_vcmpltpd,
+                        TOP_vcmplepd,
+                        TOP_vcmpunordpd,
+                        TOP_vcmpneqpd,
+                        TOP_vcmpnltpd,
+                        TOP_vcmpnlepd,
+                        TOP_vcmpordpd,
+                        TOP_vcmpeqps,
+                        TOP_vcmpltps,
+                        TOP_vcmpleps,
+                        TOP_vcmpunordps,
+                        TOP_vcmpneqps,
+                        TOP_vcmpnltps,
+                        TOP_vcmpnleps,
+                        TOP_vcmpordps,
+                        TOP_vcmpeqss,
+                        TOP_vcmpltss,
+                        TOP_vcmpless,
+                        TOP_vcmpunordss,
+                        TOP_vcmpneqss,
+                        TOP_vcmpnltss,
+                        TOP_vcmpnless,
+                        TOP_vcmpordss,
+                        TOP_vcmpeqsd,
+                        TOP_vcmpltsd,
+                        TOP_vcmplesd,
+                        TOP_vcmpunordsd,
+                        TOP_vcmpneqsd,
+                        TOP_vcmpnltsd,
+                        TOP_vcmpnlesd,
+                        TOP_vcmpordsd,
                         TOP_vfhadd128v64,
                         TOP_vfhadd128v32,
                         TOP_vfhsub128v64,
@@ -3661,12 +3693,12 @@
                         TOP_vfcmpx128v32,
                         TOP_vfcmpxx128v32,
                         TOP_vfcmpxxx128v32,
-                        TOP_vfcmpxsd,
-                        TOP_vfcmpxxsd,
-                        TOP_vfcmpxxxsd,
-                        TOP_vfcmpxss,
-                        TOP_vfcmpxxss,
-                        TOP_vfcmpxxxss,
+                        TOP_vcmpxsd,
+                        TOP_vcmpxxsd,
+                        TOP_vcmpxxxsd,
+                        TOP_vcmpxss,
+                        TOP_vcmpxxss,
+                        TOP_vcmpxxxss,
                         TOP_vfhaddx128v64,
                         TOP_vfhaddxx128v64,
                         TOP_vfhaddxxx128v64,

Modified: trunk/osprey/common/targ_info/proc/x8664/opteron_si.cxx
===================================================================
--- trunk/osprey/common/targ_info/proc/x8664/opteron_si.cxx	2011-07-01 08:06:42 UTC (rev 3667)
+++ trunk/osprey/common/targ_info/proc/x8664/opteron_si.cxx	2011-07-01 16:46:18 UTC (rev 3668)
@@ -3664,8 +3664,40 @@
                         TOP_vfcmp128v64,
                         TOP_vcmpps,
                         TOP_vfcmp128v32,
-                        TOP_vfcmpsd,
-                        TOP_vfcmpss,
+                        TOP_vcmpsd,
+                        TOP_vcmpss,
+                        TOP_vcmpeqpd,
+                        TOP_vcmpltpd,
+                        TOP_vcmplepd,
+                        TOP_vcmpunordpd,
+                        TOP_vcmpneqpd,
+                        TOP_vcmpnltpd,
+                        TOP_vcmpnlepd,
+                        TOP_vcmpordpd,
+                        TOP_vcmpeqps,
+                        TOP_vcmpltps,
+                        TOP_vcmpleps,
+                        TOP_vcmpunordps,
+                        TOP_vcmpneqps,
+                        TOP_vcmpnltps,
+                        TOP_vcmpnleps,
+                        TOP_vcmpordps,
+                        TOP_vcmpeqss,
+                        TOP_vcmpltss,
+                        TOP_vcmpless,
+                        TOP_vcmpunordss,
+                        TOP_vcmpneqss,
+                        TOP_vcmpnltss,
+                        TOP_vcmpnless,
+                        TOP_vcmpordss,
+                        TOP_vcmpeqsd,
+                        TOP_vcmpltsd,
+                        TOP_vcmplesd,
+                        TOP_vcmpunordsd,
+                        TOP_vcmpneqsd,
+                        TOP_vcmpnltsd,
+                        TOP_vcmpnlesd,
+                        TOP_vcmpordsd,
                         TOP_vfhadd128v64,
                         TOP_vfhadd128v32,
                         TOP_vfhsub128v64,
@@ -3703,12 +3735,12 @@
                         TOP_vfcmpx128v32,
                         TOP_vfcmpxx128v32,
                         TOP_vfcmpxxx128v32,
-                        TOP_vfcmpxsd,
-                        TOP_vfcmpxxsd,
-                        TOP_vfcmpxxxsd,
-                        TOP_vfcmpxss,
-                        TOP_vfcmpxxss,
-                        TOP_vfcmpxxxss,
+                        TOP_vcmpxsd,
+                        TOP_vcmpxxsd,
+                        TOP_vcmpxxxsd,
+                        TOP_vcmpxss,
+                        TOP_vcmpxxss,
+                        TOP_vcmpxxxss,
                         TOP_vfhaddx128v64,
                         TOP_vfhaddxx128v64,
                         TOP_vfhaddxxx128v64,

Modified: trunk/osprey/common/targ_info/proc/x8664/orochi_si.cxx
===================================================================
--- trunk/osprey/common/targ_info/proc/x8664/orochi_si.cxx	2011-07-01 08:06:42 UTC (rev 3667)
+++ trunk/osprey/common/targ_info/proc/x8664/orochi_si.cxx	2011-07-01 16:46:18 UTC (rev 3668)
@@ -3984,8 +3984,40 @@
                      TOP_vfcmp128v64,
                      TOP_vcmpps,
                      TOP_vfcmp128v32,
-                     TOP_vfcmpsd,
-                     TOP_vfcmpss,
+                     TOP_vcmpsd,
+                     TOP_vcmpss,
+                     TOP_vcmpeqpd,
+                     TOP_vcmpltpd,
+                     TOP_vcmplepd,
+                     TOP_vcmpunordpd,
+                     TOP_vcmpneqpd,
+                     TOP_vcmpnltpd,
+                     TOP_vcmpnlepd,
+                     TOP_vcmpordpd,
+                     TOP_vcmpeqps,
+                     TOP_vcmpltps,
+                     TOP_vcmpleps,
+                     TOP_vcmpunordps,
+                     TOP_vcmpneqps,
+                     TOP_vcmpnltps,
+                     TOP_vcmpnleps,
+                     TOP_vcmpordps,
+                     TOP_vcmpeqss,
+                     TOP_vcmpltss,
+                     TOP_vcmpless,
+                     TOP_vcmpunordss,
+                     TOP_vcmpneqss,
+                     TOP_vcmpnltss,
+                     TOP_vcmpnless,
+                     TOP_vcmpordss,
+                     TOP_vcmpeqsd,
+                     TOP_vcmpltsd,
+                     TOP_vcmplesd,
+                     TOP_vcmpunordsd,
+                     TOP_vcmpneqsd,
+                     TOP_vcmpnltsd,
+                     TOP_vcmpnlesd,
+                     TOP_vcmpordsd,
                      TOP_vfmax128v64,
                      TOP_vfmax128v32,
                      TOP_vfmaxsd,
@@ -4011,12 +4043,12 @@
                      TOP_vfcmpx128v32,
                      TOP_vfcmpxx128v32,
                      TOP_vfcmpxxx128v32,
-                     TOP_vfcmpxsd,
-                     TOP_vfcmpxxsd,
-                     TOP_vfcmpxxxsd,
-                     TOP_vfcmpxss,
-                     TOP_vfcmpxxss,
-                     TOP_vfcmpxxxss,
+                     TOP_vcmpxsd,
+                     TOP_vcmpxxsd,
+                     TOP_vcmpxxxsd,
+                     TOP_vcmpxss,
+                     TOP_vcmpxxss,
+                     TOP_vcmpxxxss,
                      TOP_vfmaxx128v64,
                      TOP_vfmaxxx128v64,
                      TOP_vfmaxxxx128v64,

Modified: trunk/osprey/common/targ_info/proc/x8664/wolfdale_si.cxx
===================================================================
--- trunk/osprey/common/targ_info/proc/x8664/wolfdale_si.cxx	2011-07-01 08:06:42 UTC (rev 3667)
+++ trunk/osprey/common/targ_info/proc/x8664/wolfdale_si.cxx	2011-07-01 16:46:18 UTC (rev 3668)
@@ -3751,8 +3751,40 @@
                         TOP_vfcmp128v64,
                         TOP_vcmpps,
                         TOP_vfcmp128v32,
-                        TOP_vfcmpsd,
-                        TOP_vfcmpss,
+                        TOP_vcmpsd,
+                        TOP_vcmpss,
+                        TOP_vcmpeqpd,
+                        TOP_vcmpltpd,
+                        TOP_vcmplepd,
+                        TOP_vcmpunordpd,
+                        TOP_vcmpneqpd,
+                        TOP_vcmpnltpd,
+                        TOP_vcmpnlepd,
+                        TOP_vcmpordpd,
+                        TOP_vcmpeqps,
+                        TOP_vcmpltps,
+                        TOP_vcmpleps,
+                        TOP_vcmpunordps,
+                        TOP_vcmpneqps,
+                        TOP_vcmpnltps,
+                        TOP_vcmpnleps,
+                        TOP_vcmpordps,
+                        TOP_vcmpeqss,
+                        TOP_vcmpltss,
+                        TOP_vcmpless,
+                        TOP_vcmpunordss,
+                        TOP_vcmpneqss,
+                        TOP_vcmpnltss,
+                        TOP_vcmpnless,
+                        TOP_vcmpordss,
+                        TOP_vcmpeqsd,
+                        TOP_vcmpltsd,
+                        TOP_vcmplesd,
+                        TOP_vcmpunordsd,
+                        TOP_vcmpneqsd,
+                        TOP_vcmpnltsd,
+                        TOP_vcmpnlesd,
+                        TOP_vcmpordsd,
                         TOP_vfhadd128v64,
                         TOP_vfhadd128v32,
                         TOP_vfhsub128v64,
@@ -3790,12 +3822,12 @@
                         TOP_vfcmpx128v32,
                         TOP_vfcmpxx128v32,
                         TOP_vfcmpxxx128v32,
-                        TOP_vfcmpxsd,
-                        TOP_vfcmpxxsd,
-                        TOP_vfcmpxxxsd,
-                        TOP_vfcmpxss,
-                        TOP_vfcmpxxss,
-                        TOP_vfcmpxxxss,
+                        TOP_vcmpxsd,
+                        TOP_vcmpxxsd,
+                        TOP_vcmpxxxsd,
+                        TOP_vcmpxss,
+                        TOP_vcmpxxss,
+                        TOP_vcmpxxxss,
                         TOP_vfhaddx128v64,
                         TOP_vfhaddxx128v64,
                         TOP_vfhaddxxx128v64,

svn | 1 Jul 18:52 2011

r3669 - in trunk/osprey/be/cg: . x8664

Author: mberg
Date: 2011-07-01 12:52:19 -0400 (Fri, 01 Jul 2011)
New Revision: 3669

Modified:
   trunk/osprey/be/cg/oputil.cxx
   trunk/osprey/be/cg/x8664/cgemit_targ.cxx
   trunk/osprey/be/cg/x8664/expand.cxx
Log:
2nd part of checkin containing cvt updates, translation table updates

CR by Jian-Xin

Modified: trunk/osprey/be/cg/oputil.cxx
===================================================================
--- trunk/osprey/be/cg/oputil.cxx	2011-07-01 16:46:18 UTC (rev 3668)
+++ trunk/osprey/be/cg/oputil.cxx	2011-07-01 16:52:19 UTC (rev 3669)
@@ -1885,12 +1885,6 @@
     {TOP_cmpeq128v8,        TOP_vcmpeq128v8},
     {TOP_cmpeq128v16,       TOP_vcmpeq128v16},
     {TOP_cmpeq128v32,       TOP_vcmpeq128v32},
-    {TOP_pcmpeqb,           TOP_vcmpeq128v8},
-    {TOP_pcmpeqw,           TOP_vcmpeq128v16},
-    {TOP_pcmpeqd,           TOP_vcmpeq128v32},
-    {TOP_pcmpgtb,           TOP_vcmpgt128v8},
-    {TOP_pcmpgtw,           TOP_vcmpgt128v16},
-    {TOP_pcmpgtd,           TOP_vcmpgt128v32},
     {TOP_frcp128v32,        TOP_vfrcp128v32},
     {TOP_fsqrt128v32,       TOP_vfsqrt128v32},
     {TOP_frsqrt128v32,      TOP_vfrsqrt128v32},
@@ -2128,41 +2122,48 @@
     {TOP_sqrtsd,            TOP_vfsqrtsd},
     {TOP_andnps,            TOP_vfandn128v32},
     {TOP_andnpd,            TOP_vfandn128v64},
-    {TOP_cmpss,             TOP_vfcmpss},
-    {TOP_cmpsd,             TOP_vfcmpsd},
-    {TOP_cmpps,             TOP_vfcmp128v32},
-    {TOP_cmppd,             TOP_vfcmp128v64},
-    {TOP_cmpeqps,           TOP_vfcmp128v32}, 
-    {TOP_cmpltps,           TOP_vfcmp128v32},
-    {TOP_cmpleps,           TOP_vfcmp128v32},
-    {TOP_cmpunordps,        TOP_vfcmp128v32},
-    {TOP_cmpneqps,          TOP_vfcmp128v32},
-    {TOP_cmpnltps,          TOP_vfcmp128v32},
-    {TOP_cmpnleps,          TOP_vfcmp128v32},
-    {TOP_cmpordps,          TOP_vfcmp128v32},
-    {TOP_cmpeqss,           TOP_vfcmpss},
-    {TOP_cmpltss,           TOP_vfcmpss},
-    {TOP_cmpless,           TOP_vfcmpss},
-    {TOP_cmpunordss,        TOP_vfcmpss},
-    {TOP_cmpneqss,          TOP_vfcmpss},
-    {TOP_cmpnltss,          TOP_vfcmpss},
-    {TOP_cmpnless,          TOP_vfcmpss},
-    {TOP_cmpordss,          TOP_vfcmpss},
+    {TOP_cmpss,             TOP_vcmpss},
+    {TOP_cmpsd,             TOP_vcmpsd},
+    {TOP_cmpps,             TOP_vcmpps},
+    {TOP_cmppd,             TOP_vcmppd},
+    {TOP_cmpeqps,           TOP_vcmpeqps}, 
+    {TOP_cmpltps,           TOP_vcmpltps},
+    {TOP_cmpleps,           TOP_vcmpleps},
+    {TOP_cmpunordps,        TOP_vcmpunordps},
+    {TOP_cmpneqps,          TOP_vcmpneqps},
+    {TOP_cmpnltps,          TOP_vcmpnltps},
+    {TOP_cmpnleps,          TOP_vcmpnleps},
+    {TOP_cmpordps,          TOP_vcmpordps},
+    {TOP_cmpeqss,           TOP_vcmpeqss},
+    {TOP_cmpltss,           TOP_vcmpltss},
+    {TOP_cmpless,           TOP_vcmpless},
+    {TOP_cmpunordss,        TOP_vcmpunordss},
+    {TOP_cmpneqss,          TOP_vcmpneqss},
+    {TOP_cmpnltss,          TOP_vcmpnltss },
+    {TOP_cmpnless,          TOP_vcmpnless},
+    {TOP_cmpordss,          TOP_vcmpordss},
+    {TOP_cmpeqsd,           TOP_vcmpeqsd},
+    {TOP_cmpltsd,           TOP_vcmpltsd},
+    {TOP_cmplesd,           TOP_vcmplesd},
+    {TOP_cmpunordsd,        TOP_vcmpunordsd},
+    {TOP_cmpneqsd,          TOP_vcmpneqsd},
+    {TOP_cmpnltsd,          TOP_vcmpnltsd },
+    {TOP_cmpnlesd,          TOP_vcmpnlesd},
+    {TOP_cmpordsd,          TOP_vcmpordsd},
     {TOP_unpckhpd,          TOP_vunpckh128v64},
     {TOP_unpckhps,          TOP_vunpckh128v32},
     {TOP_unpcklpd,          TOP_vunpckl128v64},
     {TOP_unpcklps,          TOP_vunpckl128v32},
-    {TOP_punpcklbw,         TOP_vpunpckl64v8},
-    {TOP_punpcklwd,         TOP_vpunpckl64v16},
-    {TOP_punpckldq,         TOP_vpunpckl64v32},
-    {TOP_punpckhbw,         TOP_vpunpckh64v8},
-    {TOP_punpckhwd,         TOP_vpunpckh64v16},
-    {TOP_punpckhdq,         TOP_vpunpckh64v32},
-    {TOP_packsswb,          TOP_vpacksswb},
-    {TOP_packssdw,          TOP_vpackssdw},
-    {TOP_packuswb,          TOP_vpackuswb},
+    {TOP_punpcklbw128,      TOP_vpunpckl64v8},
+    {TOP_punpcklwd128,      TOP_vpunpckl64v16},
+    {TOP_punpckldq128,      TOP_vpunpckl64v32},
+    {TOP_punpckhbw128,      TOP_vpunpckh64v8},
+    {TOP_punpckhwd128,      TOP_vpunpckh64v16},
+    {TOP_punpckhdq128,      TOP_vpunpckh64v32},
+    {TOP_packsswb128,       TOP_vpacksswb},
+    {TOP_packssdw128,       TOP_vpackssdw},
+    {TOP_packuswb128,       TOP_vpackuswb},
     {TOP_pshufd,            TOP_vpshuf128v32},
-    // TOP_vpshufw and vpshufd not consistant in the AVX names
     {TOP_pshufw,            TOP_vpshufw64v16},
     {TOP_pshuflw,           TOP_vpshuflw},
     {TOP_pshufhw,           TOP_vpshufhw},
@@ -2187,12 +2188,11 @@
     {TOP_xzero128v32,       TOP_vxzero128v32},
     {TOP_xzero128v64,       TOP_vxzero128v64},
     {TOP_subus128v16,       TOP_vsubus128v16},
-    {TOP_pavgb,             TOP_vpavgb},
-    {TOP_pavgw,             TOP_vpavgw},
-    {TOP_psadbw,            TOP_vpsadbw},
+    {TOP_pavgb128,          TOP_vpavgb},
+    {TOP_pavgw128,          TOP_vpavgw},
+    {TOP_psadbw128,         TOP_vpsadbw},
     {TOP_storenti128,       TOP_vstorenti128},
     {TOP_storelpd,          TOP_vstorelpd},
-    {TOP_pshufw64v16,       TOP_vpshufw64v16},
     {TOP_pmovmskb128,       TOP_vpmovmskb128},
     // SSE 4.1
     {TOP_mpsadbw,            TOP_vmpsadbw},

Modified: trunk/osprey/be/cg/x8664/cgemit_targ.cxx
===================================================================
--- trunk/osprey/be/cg/x8664/cgemit_targ.cxx	2011-07-01 16:46:18 UTC (rev 3668)
+++ trunk/osprey/be/cg/x8664/cgemit_targ.cxx	2011-07-01 16:52:19 UTC (rev 3669)
@@ -1676,14 +1676,14 @@
   OP_Name[TOP_vfcmpx128v32] = "vcmpps";
   OP_Name[TOP_vfcmpxx128v32] = "vcmpps";
   OP_Name[TOP_vfcmpxxx128v32] = "vcmpps";
-  OP_Name[TOP_vfcmpsd] = "vcmpsd";
-  OP_Name[TOP_vfcmpxsd] = "vcmpsd";
-  OP_Name[TOP_vfcmpxxsd] = "vcmpsd";
-  OP_Name[TOP_vfcmpxxxsd] = "vcmpsd";
-  OP_Name[TOP_vfcmpss] = "vcmpss";
-  OP_Name[TOP_vfcmpxss] = "vcmpss";
-  OP_Name[TOP_vfcmpxxss] = "vcmpss";
-  OP_Name[TOP_vfcmpxxxss] = "vcmpss";
+  OP_Name[TOP_vcmpsd] = "vcmpsd";
+  OP_Name[TOP_vcmpxsd] = "vcmpsd";
+  OP_Name[TOP_vcmpxxsd] = "vcmpsd";
+  OP_Name[TOP_vcmpxxxsd] = "vcmpsd";
+  OP_Name[TOP_vcmpss] = "vcmpss";
+  OP_Name[TOP_vcmpxss] = "vcmpss";
+  OP_Name[TOP_vcmpxxss] = "vcmpss";
+  OP_Name[TOP_vcmpxxxss] = "vcmpss";
   OP_Name[TOP_vcomisd] = "vcomisd";
   OP_Name[TOP_vcomixsd] = "vcomisd";
   OP_Name[TOP_vcomixxsd] = "vcomisd";
@@ -2145,14 +2145,14 @@
   OP_Name[TOP_vandnx128v64] = "vpandn";
   OP_Name[TOP_vandnxx128v64] = "vpandn";
   OP_Name[TOP_vandnxxx128v64] = "vpandn";
-  OP_Name[TOP_vpavgb] = "pavgb";
-  OP_Name[TOP_vpavgbx] = "pavgb";
-  OP_Name[TOP_vpavgbxx] = "pavgb";
-  OP_Name[TOP_vpavgbxxx] = "pavgb";
-  OP_Name[TOP_vpavgw] = "pavgw";
-  OP_Name[TOP_vpavgwx] = "pavgw";
-  OP_Name[TOP_vpavgwxx] = "pavgw";
-  OP_Name[TOP_vpavgwxxx] = "pavgw";
+  OP_Name[TOP_vpavgb] = "vpavgb";
+  OP_Name[TOP_vpavgbx] = "vpavgb";
+  OP_Name[TOP_vpavgbxx] = "vpavgb";
+  OP_Name[TOP_vpavgbxxx] = "vpavgb";
+  OP_Name[TOP_vpavgw] = "vpavgw";
+  OP_Name[TOP_vpavgwx] = "vpavgw";
+  OP_Name[TOP_vpavgwxx] = "vpavgw";
+  OP_Name[TOP_vpavgwxxx] = "vpavgw";
   OP_Name[TOP_vblendv128v8] = "vpblendvb";
   OP_Name[TOP_vblendvx128v8] = "vpblendvb";
   OP_Name[TOP_vblendvxx128v8] = "vpblendvb";

Modified: trunk/osprey/be/cg/x8664/expand.cxx
===================================================================
--- trunk/osprey/be/cg/x8664/expand.cxx	2011-07-01 16:46:18 UTC (rev 3668)
+++ trunk/osprey/be/cg/x8664/expand.cxx	2011-07-01 16:52:19 UTC (rev 3669)
@@ -8196,7 +8196,13 @@
       Build_OP( TOP_movx2g, tmp0, op0, ops );
       op0 = tmp0;
     }
-    Build_OP( TOP_cvtsi2ss, result, op0, ops );
+    if (Is_Target_Orochi() && Is_Target_AVX()) {
+      TN *xzero = Build_TN_Like(result);
+      Build_OP( TOP_xzero128v32, xzero, ops );
+      Build_OP( TOP_cvtsi2ss, result, xzero, op0, ops );
+    } else {
+      Build_OP( TOP_cvtsi2ss, result, op0, ops );
+    }
     break;
   case INTRN_CVTSI642SS:
     if (TN_register_class(op0) != ISA_REGISTER_CLASS_integer) {
@@ -8204,7 +8210,13 @@
       Build_OP( TOP_movx2g64, tmp0, op0, ops );
       op0 = tmp0;
     }
-    Build_OP( TOP_cvtsi2ssq, result, op0, ops );
+    if (Is_Target_Orochi() && Is_Target_AVX()) {
+      TN *xzero = Build_TN_Like(result);
+      Build_OP( TOP_xzero128v32, xzero, ops );
+      Build_OP( TOP_cvtsi2ssq, result, xzero, op0, ops );
+    } else {
+      Build_OP( TOP_cvtsi2ssq, result, op0, ops );
+    }
     break;
   case INTRN_CVTSS2SI:
     Build_OP( TOP_cvtss2si, result, op0, ops );
@@ -8283,10 +8295,22 @@
     Build_OP( TOP_cvtps2pd, result, op0, ops );
     break;
   case INTRN_CVTSD2SS:
-    Build_OP( TOP_cvtsd2ss, result, op0, ops );
+    if (Is_Target_Orochi() && Is_Target_AVX()) {
+      TN *xzero = Build_TN_Like(result);
+      Build_OP( TOP_xzero128v32, xzero, ops );
+      Build_OP( TOP_cvtsd2ss, result, xzero, op0, ops );
+    } else {
+      Build_OP( TOP_cvtsd2ss, result, op0, ops );
+    }
     break;
   case INTRN_CVTSS2SD:
-    Build_OP( TOP_cvtss2sd, result, op0, ops );
+    if (Is_Target_Orochi() && Is_Target_AVX()) {
+      TN *xzero = Build_TN_Like(result);
+      Build_OP( TOP_xzero128v32, xzero, ops );
+      Build_OP( TOP_cvtss2sd, result, xzero, op0, ops );
+    } else {
+      Build_OP( TOP_cvtss2sd, result, op0, ops );
+    }
     break;
   case INTRN_LOADUPS:
     Build_OP( TOP_ldups, result, op0, Gen_Literal_TN (0,4), ops );
@@ -8590,10 +8614,10 @@
     Build_OP(TOP_vfcmp128v32, result, op0, op1, op2, ops );
     break;
    case INTRN_CMPSD:
-    Build_OP(TOP_vfcmpsd, result, op0, op1, op2, ops );
+    Build_OP(TOP_vcmpsd, result, op0, op1, op2, ops );
     break;
    case INTRN_CMPSS:
-    Build_OP(TOP_vfcmpss, result, op0, op1, op2, ops );
+    Build_OP(TOP_vcmpss, result, op0, op1, op2, ops );
     break;
    case INTRN_CVTDQ2PD256:
     Build_OP(TOP_vcvtdq2pd, result, op0, ops );

Mathew, Pallavi | 1 Jul 19:54 2011

Code review request for dangling scalar references bug [LNO]

Hi,

Can a gatekeeper please review the attached patch that fixes an issue with dead scalar references?

 

Testcase:

extern double sqrt(double);
int global_arr[4];
void SetPoints()
{
  int i, r0;
  for (i=0; i<10; i++) {
    r0 = global_arr[3];
    sqrt(r0 * r0);
  }
}

 

Compiling with 'opencc -O3 -apo' gives:

### Assertion failure at line 710 of .../../../osprey/be/lno/access_vector.h:
### Compiler Error in file sample.c during Loop Nest Optimizer phase:
### SYMBOL::Init(WN*) called with opcode 66612
opencc INTERNAL ERROR: .../x86_64-open64-linux/4.2/be returned non-zero status 1

 

Problem/Fix Description:

The error is caused by a dangling pointer left over from an incomplete
cleanup after array substitution removes a WHIRL node. The error shows up
only for calls to the builtin math function 'sqrt' (there may be other such
functions), where the original call is replaced by a builtin and a new call
is placed depending on the accuracy of the result. The problem does not show
up with '-fno-math-errno'.

The fix adds a cleanup function to remove scalar references to WHIRL nodes
marked for deletion.
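
As a rough illustration of the kind of cleanup described above (this is not the attached patch), the sketch below drops every recorded reference to a node that is about to be deleted. The names WhirlNode, ScalarRef and RemoveRefsTo are hypothetical stand-ins, and std::vector replaces the compiler's own stack template.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a WHIRL node; not the real WN type.
struct WhirlNode { int id; };

// Hypothetical stand-in for a scalar reference record.
struct ScalarRef {
  WhirlNode *wn;         // node through which the scalar is referenced
  int statement_number;  // statement that contains the reference
};

// Remove every reference that points at a node marked for deletion,
// so no dangling pointer is left behind. Returns how many were dropped.
inline std::size_t RemoveRefsTo(std::vector<ScalarRef> &refs,
                                const WhirlNode *doomed) {
  assert(doomed != nullptr);
  auto new_end = std::remove_if(
      refs.begin(), refs.end(),
      [doomed](const ScalarRef &r) { return r.wn == doomed; });
  std::size_t dropped = static_cast<std::size_t>(refs.end() - new_end);
  refs.erase(new_end, refs.end());
  return dropped;
}
```

The point of the sketch is only the ordering: the references are purged before the node itself is freed, so later passes that walk the reference list never see a stale pointer.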

 

The loop in sample.c:

  for (i=0; i<10; i++) {
    r0 = global_arr[3]; //L1
    sqrt(r0 * r0); //L2
  }

 

The CALL_INFO corresponding to L2 has 'r0' as a scalar_use.
Array substitution replaces 'r0' with 'global_arr[3]' and deletes the WHIRL
node for 'r0'. But the reference to 'r0' still exists within the call_info of
'sqrt', which results in an inconsistency (assertion failure) in
array_loop_info that is detected when merging a call's scalar uses as part of
auto-parallelization.

 

Note that had the call to 'sqrt' remained a VCALL, it would not go through
array substitution. Because 'sqrt' is a builtin, the VCALL is replaced (in the
absence of -fno-math-errno) by F8SQRT, with the result saved in temp_var;
depending on whether or not the result is a NaN, a call is placed to
F8CALL_SQRT. (See the trace file with -Wb,-tra.)

-apo is needed to expose the error because annotation of calls
(NSE_Annotate_Scalar_Call, called by IPA_LNO_Map_Calls) only occurs under
autopar.
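
For readers unfamiliar with the builtin expansion of 'sqrt' described above, here is a source-level sketch of its shape. It is an approximation written in C++, not compiler output: std::sqrt stands in for the inline F8SQRT, LibrarySqrt is a hypothetical stand-in for the F8CALL_SQRT library call, and g_library_calls is a counter added only to make the control flow observable.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical counter: how many times the out-of-line call was taken.
static int g_library_calls = 0;

// Hypothetical stand-in for the library call (F8CALL_SQRT in the trace).
inline double LibrarySqrt(double x) {
  ++g_library_calls;
  return std::sqrt(x);
}

// Approximate shape of the expansion: compute inline first, and fall
// back to the library call only when the inline result is a NaN.
inline double ExpandedSqrt(double x) {
  double temp_var = std::sqrt(x);  // stands in for the inline F8SQRT
  if (std::isnan(temp_var)) {
    temp_var = LibrarySqrt(x);     // slow path, taken only for NaN results
  }
  return temp_var;
}
```

Because the argument ('r0 * r0' in the testcase) now feeds an ordinary expression and a conditional call instead of a single opaque VCALL, it becomes visible to array substitution, which is how the stale scalar reference arises.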

 

The patch to function 'DeleteElement' in osprey/be/com/cxx_template.h avoids
a compile error (during the compiler build) caused by the absence of a suitable
assignment operator for SCALAR_NODE and SCALAR_REF. This issue was not exposed
earlier because existing calls to 'DeleteTop' were only on stacks of pointers.
An alternative patch would be to define dummy assignment operators like the
ones shown below, although I prefer the patch in the attachment.

 

Index: osprey/be/com/dep_graph.h
===================================================================
--- osprey/be/com/dep_graph.h              (revision 1418)
+++ osprey/be/com/dep_graph.h           (working copy)
@@ -815,6 +815,11 @@
     Statement_Number = scalar_ref.Statement_Number;
     return *this;
   }
+  SCALAR_REF& operator=(void*) {
+    Wn =NULL;
+    Statement_Number = 0;
+    return *this;
+  }
 };
 
 // all the references to a particular SYMBOL
@@ -830,6 +835,10 @@
     _scalar_ref_stack = CXX_NEW(SCALAR_REF_STACK(pool),pool);
     _scalar = scalar;
   }
+  SCALAR_NODE& operator=(void*) {
+    _scalar_ref_stack = NULL;
+    return *this;
+  }
   INT Elements() const { return _scalar_ref_stack->Elements(); };
   SCALAR_REF *Bottom_nth(INT i) { return &_scalar_ref_stack->Bottom_nth(i); };
   SCALAR_REF *Top_nth(INT i) { return &_scalar_ref_stack->Top_nth(i); };
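
To show why a stack template can compile for pointer elements yet fail for class-type elements, here is a small sketch. It is an assumption about the general idea only and does not reproduce cxx_template.h or the attached patch; MiniStack and Ref are hypothetical names, and std::vector is used for storage.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature of a stack template.
template <class T>
class MiniStack {
  std::vector<T> items_;

 public:
  void Push(const T &v) { items_.push_back(v); }
  int Elements() const { return static_cast<int>(items_.size()); }
  T &Bottom_nth(int i) { return items_[i]; }

  // A pointer-only implementation might clear the vacated slot with
  //   items_[last] = NULL;
  // which compiles only if T can be assigned from a null pointer constant.
  // The version below needs nothing beyond T's ordinary copy assignment:
  // later elements are shifted down and the last slot is popped.
  void DeleteElement(int i) {
    assert(i >= 0 && i < Elements());
    for (int j = i; j + 1 < Elements(); ++j) {
      items_[j] = items_[j + 1];
    }
    items_.pop_back();
  }
};

// Class-type element with no operator=(void*), like SCALAR_REF before
// the dummy operators in the diff above are added.
struct Ref { int statement_number; };
```

With this shape, MiniStack<Ref> and MiniStack<Ref*> both compile, and no dummy operator=(void*) is needed on the element type.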

 

Thanks.

Pallavi

Attachment (scalar_ref.p): application/octet-stream, 7214 bytes
