243 files changed, 0 insertions, 51156 deletions
diff --git a/COPYING b/COPYING deleted file mode 100644 index 623b625..0000000 --- a/COPYING +++ /dev/null @@ -1,340 +0,0 @@ - GNU GENERAL PUBLIC LICENSE - Version 2, June 1991 - - Copyright (C) 1989, 1991 Free Software Foundation, Inc. - 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA - Everyone is permitted to copy and distribute verbatim copies - of this license document, but changing it is not allowed. - - Preamble - - The licenses for most software are designed to take away your -freedom to share and change it. By contrast, the GNU General Public -License is intended to guarantee your freedom to share and change free -software--to make sure the software is free for all its users. This -General Public License applies to most of the Free Software -Foundation's software and to any other program whose authors commit to -using it. (Some other Free Software Foundation software is covered by -the GNU Library General Public License instead.) You can apply it to -your programs, too. - - When we speak of free software, we are referring to freedom, not -price. Our General Public Licenses are designed to make sure that you -have the freedom to distribute copies of free software (and charge for -this service if you wish), that you receive source code or can get it -if you want it, that you can change the software or use pieces of it -in new free programs; and that you know you can do these things. - - To protect your rights, we need to make restrictions that forbid -anyone to deny you these rights or to ask you to surrender the rights. -These restrictions translate to certain responsibilities for you if you -distribute copies of the software, or if you modify it. - - For example, if you distribute copies of such a program, whether -gratis or for a fee, you must give the recipients all the rights that -you have. You must make sure that they, too, receive or can get the -source code. And you must show them these terms so they know their -rights. 
- - We protect your rights with two steps: (1) copyright the software, and -(2) offer you this license which gives you legal permission to copy, -distribute and/or modify the software. - - Also, for each author's protection and ours, we want to make certain -that everyone understands that there is no warranty for this free -software. If the software is modified by someone else and passed on, we -want its recipients to know that what they have is not the original, so -that any problems introduced by others will not reflect on the original -authors' reputations. - - Finally, any free program is threatened constantly by software -patents. We wish to avoid the danger that redistributors of a free -program will individually obtain patent licenses, in effect making the -program proprietary. To prevent this, we have made it clear that any -patent must be licensed for everyone's free use or not licensed at all. - - The precise terms and conditions for copying, distribution and -modification follow. - - GNU GENERAL PUBLIC LICENSE - TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION - - 0. This License applies to any program or other work which contains -a notice placed by the copyright holder saying it may be distributed -under the terms of this General Public License. The "Program", below, -refers to any such program or work, and a "work based on the Program" -means either the Program or any derivative work under copyright law: -that is to say, a work containing the Program or a portion of it, -either verbatim or with modifications and/or translated into another -language. (Hereinafter, translation is included without limitation in -the term "modification".) Each licensee is addressed as "you". - -Activities other than copying, distribution and modification are not -covered by this License; they are outside its scope. 
The act of -running the Program is not restricted, and the output from the Program -is covered only if its contents constitute a work based on the -Program (independent of having been made by running the Program). -Whether that is true depends on what the Program does. - - 1. You may copy and distribute verbatim copies of the Program's -source code as you receive it, in any medium, provided that you -conspicuously and appropriately publish on each copy an appropriate -copyright notice and disclaimer of warranty; keep intact all the -notices that refer to this License and to the absence of any warranty; -and give any other recipients of the Program a copy of this License -along with the Program. - -You may charge a fee for the physical act of transferring a copy, and -you may at your option offer warranty protection in exchange for a fee. - - 2. You may modify your copy or copies of the Program or any portion -of it, thus forming a work based on the Program, and copy and -distribute such modifications or work under the terms of Section 1 -above, provided that you also meet all of these conditions: - - a) You must cause the modified files to carry prominent notices - stating that you changed the files and the date of any change. - - b) You must cause any work that you distribute or publish, that in - whole or in part contains or is derived from the Program or any - part thereof, to be licensed as a whole at no charge to all third - parties under the terms of this License. - - c) If the modified program normally reads commands interactively - when run, you must cause it, when started running for such - interactive use in the most ordinary way, to print or display an - announcement including an appropriate copyright notice and a - notice that there is no warranty (or else, saying that you provide - a warranty) and that users may redistribute the program under - these conditions, and telling the user how to view a copy of this - License. 
(Exception: if the Program itself is interactive but - does not normally print such an announcement, your work based on - the Program is not required to print an announcement.) - -These requirements apply to the modified work as a whole. If -identifiable sections of that work are not derived from the Program, -and can be reasonably considered independent and separate works in -themselves, then this License, and its terms, do not apply to those -sections when you distribute them as separate works. But when you -distribute the same sections as part of a whole which is a work based -on the Program, the distribution of the whole must be on the terms of -this License, whose permissions for other licensees extend to the -entire whole, and thus to each and every part regardless of who wrote it. - -Thus, it is not the intent of this section to claim rights or contest -your rights to work written entirely by you; rather, the intent is to -exercise the right to control the distribution of derivative or -collective works based on the Program. - -In addition, mere aggregation of another work not based on the Program -with the Program (or with a work based on the Program) on a volume of -a storage or distribution medium does not bring the other work under -the scope of this License. - - 3. 
You may copy and distribute the Program (or a work based on it, -under Section 2) in object code or executable form under the terms of -Sections 1 and 2 above provided that you also do one of the following: - - a) Accompany it with the complete corresponding machine-readable - source code, which must be distributed under the terms of Sections - 1 and 2 above on a medium customarily used for software interchange; or, - - b) Accompany it with a written offer, valid for at least three - years, to give any third party, for a charge no more than your - cost of physically performing source distribution, a complete - machine-readable copy of the corresponding source code, to be - distributed under the terms of Sections 1 and 2 above on a medium - customarily used for software interchange; or, - - c) Accompany it with the information you received as to the offer - to distribute corresponding source code. (This alternative is - allowed only for noncommercial distribution and only if you - received the program in object code or executable form with such - an offer, in accord with Subsection b above.) - -The source code for a work means the preferred form of the work for -making modifications to it. For an executable work, complete source -code means all the source code for all modules it contains, plus any -associated interface definition files, plus the scripts used to -control compilation and installation of the executable. However, as a -special exception, the source code distributed need not include -anything that is normally distributed (in either source or binary -form) with the major components (compiler, kernel, and so on) of the -operating system on which the executable runs, unless that component -itself accompanies the executable. 
- -If distribution of executable or object code is made by offering -access to copy from a designated place, then offering equivalent -access to copy the source code from the same place counts as -distribution of the source code, even though third parties are not -compelled to copy the source along with the object code. - - 4. You may not copy, modify, sublicense, or distribute the Program -except as expressly provided under this License. Any attempt -otherwise to copy, modify, sublicense or distribute the Program is -void, and will automatically terminate your rights under this License. -However, parties who have received copies, or rights, from you under -this License will not have their licenses terminated so long as such -parties remain in full compliance. - - 5. You are not required to accept this License, since you have not -signed it. However, nothing else grants you permission to modify or -distribute the Program or its derivative works. These actions are -prohibited by law if you do not accept this License. Therefore, by -modifying or distributing the Program (or any work based on the -Program), you indicate your acceptance of this License to do so, and -all its terms and conditions for copying, distributing or modifying -the Program or works based on it. - - 6. Each time you redistribute the Program (or any work based on the -Program), the recipient automatically receives a license from the -original licensor to copy, distribute or modify the Program subject to -these terms and conditions. You may not impose any further -restrictions on the recipients' exercise of the rights granted herein. -You are not responsible for enforcing compliance by third parties to -this License. - - 7. 
If, as a consequence of a court judgment or allegation of patent -infringement or for any other reason (not limited to patent issues), -conditions are imposed on you (whether by court order, agreement or -otherwise) that contradict the conditions of this License, they do not -excuse you from the conditions of this License. If you cannot -distribute so as to satisfy simultaneously your obligations under this -License and any other pertinent obligations, then as a consequence you -may not distribute the Program at all. For example, if a patent -license would not permit royalty-free redistribution of the Program by -all those who receive copies directly or indirectly through you, then -the only way you could satisfy both it and this License would be to -refrain entirely from distribution of the Program. - -If any portion of this section is held invalid or unenforceable under -any particular circumstance, the balance of the section is intended to -apply and the section as a whole is intended to apply in other -circumstances. - -It is not the purpose of this section to induce you to infringe any -patents or other property right claims or to contest validity of any -such claims; this section has the sole purpose of protecting the -integrity of the free software distribution system, which is -implemented by public license practices. Many people have made -generous contributions to the wide range of software distributed -through that system in reliance on consistent application of that -system; it is up to the author/donor to decide if he or she is willing -to distribute software through any other system and a licensee cannot -impose that choice. - -This section is intended to make thoroughly clear what is believed to -be a consequence of the rest of this License. - - 8. 
If the distribution and/or use of the Program is restricted in -certain countries either by patents or by copyrighted interfaces, the -original copyright holder who places the Program under this License -may add an explicit geographical distribution limitation excluding -those countries, so that distribution is permitted only in or among -countries not thus excluded. In such case, this License incorporates -the limitation as if written in the body of this License. - - 9. The Free Software Foundation may publish revised and/or new versions -of the General Public License from time to time. Such new versions will -be similar in spirit to the present version, but may differ in detail to -address new problems or concerns. - -Each version is given a distinguishing version number. If the Program -specifies a version number of this License which applies to it and "any -later version", you have the option of following the terms and conditions -either of that version or of any later version published by the Free -Software Foundation. If the Program does not specify a version number of -this License, you may choose any version ever published by the Free Software -Foundation. - - 10. If you wish to incorporate parts of the Program into other free -programs whose distribution conditions are different, write to the author -to ask for permission. For software which is copyrighted by the Free -Software Foundation, write to the Free Software Foundation; we sometimes -make exceptions for this. Our decision will be guided by the two goals -of preserving the free status of all derivatives of our free software and -of promoting the sharing and reuse of software generally. - - NO WARRANTY - - 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY -FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN -OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES -PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED -OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF -MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS -TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE -PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, -REPAIR OR CORRECTION. - - 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING -WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR -REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, -INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING -OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED -TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY -YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER -PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE -POSSIBILITY OF SUCH DAMAGES. - - END OF TERMS AND CONDITIONS - - How to Apply These Terms to Your New Programs - - If you develop a new program, and you want it to be of the greatest -possible use to the public, the best way to achieve this is to make it -free software which everyone can redistribute and change under these terms. - - To do so, attach the following notices to the program. It is safest -to attach them to the start of each source file to most effectively -convey the exclusion of warranty; and each file should have at least -the "copyright" line and a pointer to where the full notice is found. 
- - <one line to give the program's name and a brief idea of what it does.> - Copyright (C) <year> <name of author> - - This program is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 2 of the License, or - (at your option) any later version. - - This program is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - You should have received a copy of the GNU General Public License - along with this program; if not, write to the Free Software - Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA - - -Also add information on how to contact you by electronic and paper mail. - -If the program is interactive, make it output a short notice like this -when it starts in an interactive mode: - - Gnomovision version 69, Copyright (C) year name of author - Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. - This is free software, and you are welcome to redistribute it - under certain conditions; type `show c' for details. - -The hypothetical commands `show w' and `show c' should show the appropriate -parts of the General Public License. Of course, the commands you use may -be called something other than `show w' and `show c'; they could even be -mouse-clicks or menu items--whatever suits your program. - -You should also get your employer (if you work as a programmer) or your -school, if any, to sign a "copyright disclaimer" for the program, if -necessary. Here is a sample; alter the names: - - Yoyodyne, Inc., hereby disclaims all copyright interest in the program - `Gnomovision' (which makes passes at compilers) written by James Hacker. 
- - <signature of Ty Coon>, 1 April 1989 - Ty Coon, President of Vice - -This General Public License does not permit incorporating your program into -proprietary programs. If your program is a subroutine library, you may -consider it more useful to permit linking proprietary applications with the -library. If this is what you want to do, use the GNU Library General -Public License instead of this License. diff --git a/COPYING.LIB b/COPYING.LIB deleted file mode 100644 index 2d2d780..0000000 --- a/COPYING.LIB +++ /dev/null @@ -1,510 +0,0 @@ - - GNU LESSER GENERAL PUBLIC LICENSE - Version 2.1, February 1999 - - Copyright (C) 1991, 1999 Free Software Foundation, Inc. - 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA - Everyone is permitted to copy and distribute verbatim copies - of this license document, but changing it is not allowed. - -[This is the first released version of the Lesser GPL. It also counts - as the successor of the GNU Library Public License, version 2, hence - the version number 2.1.] - - Preamble - - The licenses for most software are designed to take away your -freedom to share and change it. By contrast, the GNU General Public -Licenses are intended to guarantee your freedom to share and change -free software--to make sure the software is free for all its users. - - This license, the Lesser General Public License, applies to some -specially designated software packages--typically libraries--of the -Free Software Foundation and other authors who decide to use it. You -can use it too, but we suggest you first think carefully about whether -this license or the ordinary General Public License is the better -strategy to use in any particular case, based on the explanations -below. - - When we speak of free software, we are referring to freedom of use, -not price. 
Our General Public Licenses are designed to make sure that -you have the freedom to distribute copies of free software (and charge -for this service if you wish); that you receive source code or can get -it if you want it; that you can change the software and use pieces of -it in new free programs; and that you are informed that you can do -these things. - - To protect your rights, we need to make restrictions that forbid -distributors to deny you these rights or to ask you to surrender these -rights. These restrictions translate to certain responsibilities for -you if you distribute copies of the library or if you modify it. - - For example, if you distribute copies of the library, whether gratis -or for a fee, you must give the recipients all the rights that we gave -you. You must make sure that they, too, receive or can get the source -code. If you link other code with the library, you must provide -complete object files to the recipients, so that they can relink them -with the library after making changes to the library and recompiling -it. And you must show them these terms so they know their rights. - - We protect your rights with a two-step method: (1) we copyright the -library, and (2) we offer you this license, which gives you legal -permission to copy, distribute and/or modify the library. - - To protect each distributor, we want to make it very clear that -there is no warranty for the free library. Also, if the library is -modified by someone else and passed on, the recipients should know -that what they have is not the original version, so that the original -author's reputation will not be affected by problems that might be -introduced by others. - - Finally, software patents pose a constant threat to the existence of -any free program. We wish to make sure that a company cannot -effectively restrict the users of a free program by obtaining a -restrictive license from a patent holder. 
Therefore, we insist that -any patent license obtained for a version of the library must be -consistent with the full freedom of use specified in this license. - - Most GNU software, including some libraries, is covered by the -ordinary GNU General Public License. This license, the GNU Lesser -General Public License, applies to certain designated libraries, and -is quite different from the ordinary General Public License. We use -this license for certain libraries in order to permit linking those -libraries into non-free programs. - - When a program is linked with a library, whether statically or using -a shared library, the combination of the two is legally speaking a -combined work, a derivative of the original library. The ordinary -General Public License therefore permits such linking only if the -entire combination fits its criteria of freedom. The Lesser General -Public License permits more lax criteria for linking other code with -the library. - - We call this license the "Lesser" General Public License because it -does Less to protect the user's freedom than the ordinary General -Public License. It also provides other free software developers Less -of an advantage over competing non-free programs. These disadvantages -are the reason we use the ordinary General Public License for many -libraries. However, the Lesser license provides advantages in certain -special circumstances. - - For example, on rare occasions, there may be a special need to -encourage the widest possible use of a certain library, so that it -becomes a de-facto standard. To achieve this, non-free programs must -be allowed to use the library. A more frequent case is that a free -library does the same job as widely used non-free libraries. In this -case, there is little to gain by limiting the free library to free -software only, so we use the Lesser General Public License. 
- - In other cases, permission to use a particular library in non-free -programs enables a greater number of people to use a large body of -free software. For example, permission to use the GNU C Library in -non-free programs enables many more people to use the whole GNU -operating system, as well as its variant, the GNU/Linux operating -system. - - Although the Lesser General Public License is Less protective of the -users' freedom, it does ensure that the user of a program that is -linked with the Library has the freedom and the wherewithal to run -that program using a modified version of the Library. - - The precise terms and conditions for copying, distribution and -modification follow. Pay close attention to the difference between a -"work based on the library" and a "work that uses the library". The -former contains code derived from the library, whereas the latter must -be combined with the library in order to run. - - GNU LESSER GENERAL PUBLIC LICENSE - TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION - - 0. This License Agreement applies to any software library or other -program which contains a notice placed by the copyright holder or -other authorized party saying it may be distributed under the terms of -this Lesser General Public License (also called "this License"). -Each licensee is addressed as "you". - - A "library" means a collection of software functions and/or data -prepared so as to be conveniently linked with application programs -(which use some of those functions and data) to form executables. - - The "Library", below, refers to any such software library or work -which has been distributed under these terms. A "work based on the -Library" means either the Library or any derivative work under -copyright law: that is to say, a work containing the Library or a -portion of it, either verbatim or with modifications and/or translated -straightforwardly into another language. 
(Hereinafter, translation is -included without limitation in the term "modification".) - - "Source code" for a work means the preferred form of the work for -making modifications to it. For a library, complete source code means -all the source code for all modules it contains, plus any associated -interface definition files, plus the scripts used to control -compilation and installation of the library. - - Activities other than copying, distribution and modification are not -covered by this License; they are outside its scope. The act of -running a program using the Library is not restricted, and output from -such a program is covered only if its contents constitute a work based -on the Library (independent of the use of the Library in a tool for -writing it). Whether that is true depends on what the Library does -and what the program that uses the Library does. - - 1. You may copy and distribute verbatim copies of the Library's -complete source code as you receive it, in any medium, provided that -you conspicuously and appropriately publish on each copy an -appropriate copyright notice and disclaimer of warranty; keep intact -all the notices that refer to this License and to the absence of any -warranty; and distribute a copy of this License along with the -Library. - - You may charge a fee for the physical act of transferring a copy, -and you may at your option offer warranty protection in exchange for a -fee. - - 2. You may modify your copy or copies of the Library or any portion -of it, thus forming a work based on the Library, and copy and -distribute such modifications or work under the terms of Section 1 -above, provided that you also meet all of these conditions: - - a) The modified work must itself be a software library. - - b) You must cause the files modified to carry prominent notices - stating that you changed the files and the date of any change. 
- - c) You must cause the whole of the work to be licensed at no - charge to all third parties under the terms of this License. - - d) If a facility in the modified Library refers to a function or a - table of data to be supplied by an application program that uses - the facility, other than as an argument passed when the facility - is invoked, then you must make a good faith effort to ensure that, - in the event an application does not supply such function or - table, the facility still operates, and performs whatever part of - its purpose remains meaningful. - - (For example, a function in a library to compute square roots has - a purpose that is entirely well-defined independent of the - application. Therefore, Subsection 2d requires that any - application-supplied function or table used by this function must - be optional: if the application does not supply it, the square - root function must still compute square roots.) - -These requirements apply to the modified work as a whole. If -identifiable sections of that work are not derived from the Library, -and can be reasonably considered independent and separate works in -themselves, then this License, and its terms, do not apply to those -sections when you distribute them as separate works. But when you -distribute the same sections as part of a whole which is a work based -on the Library, the distribution of the whole must be on the terms of -this License, whose permissions for other licensees extend to the -entire whole, and thus to each and every part regardless of who wrote -it. - -Thus, it is not the intent of this section to claim rights or contest -your rights to work written entirely by you; rather, the intent is to -exercise the right to control the distribution of derivative or -collective works based on the Library. 
- -In addition, mere aggregation of another work not based on the Library -with the Library (or with a work based on the Library) on a volume of -a storage or distribution medium does not bring the other work under -the scope of this License. - - 3. You may opt to apply the terms of the ordinary GNU General Public -License instead of this License to a given copy of the Library. To do -this, you must alter all the notices that refer to this License, so -that they refer to the ordinary GNU General Public License, version 2, -instead of to this License. (If a newer version than version 2 of the -ordinary GNU General Public License has appeared, then you can specify -that version instead if you wish.) Do not make any other change in -these notices. - - Once this change is made in a given copy, it is irreversible for -that copy, so the ordinary GNU General Public License applies to all -subsequent copies and derivative works made from that copy. - - This option is useful when you wish to copy part of the code of -the Library into a program that is not a library. - - 4. You may copy and distribute the Library (or a portion or -derivative of it, under Section 2) in object code or executable form -under the terms of Sections 1 and 2 above provided that you accompany -it with the complete corresponding machine-readable source code, which -must be distributed under the terms of Sections 1 and 2 above on a -medium customarily used for software interchange. - - If distribution of object code is made by offering access to copy -from a designated place, then offering equivalent access to copy the -source code from the same place satisfies the requirement to -distribute the source code, even though third parties are not -compelled to copy the source along with the object code. - - 5. A program that contains no derivative of any portion of the -Library, but is designed to work with the Library by being compiled or -linked with it, is called a "work that uses the Library". 
Such a -work, in isolation, is not a derivative work of the Library, and -therefore falls outside the scope of this License. - - However, linking a "work that uses the Library" with the Library -creates an executable that is a derivative of the Library (because it -contains portions of the Library), rather than a "work that uses the -library". The executable is therefore covered by this License. -Section 6 states terms for distribution of such executables. - - When a "work that uses the Library" uses material from a header file -that is part of the Library, the object code for the work may be a -derivative work of the Library even though the source code is not. -Whether this is true is especially significant if the work can be -linked without the Library, or if the work is itself a library. The -threshold for this to be true is not precisely defined by law. - - If such an object file uses only numerical parameters, data -structure layouts and accessors, and small macros and small inline -functions (ten lines or less in length), then the use of the object -file is unrestricted, regardless of whether it is legally a derivative -work. (Executables containing this object code plus portions of the -Library will still fall under Section 6.) - - Otherwise, if the work is a derivative of the Library, you may -distribute the object code for the work under the terms of Section 6. -Any executables containing that work also fall under Section 6, -whether or not they are linked directly with the Library itself. - - 6. As an exception to the Sections above, you may also combine or -link a "work that uses the Library" with the Library to produce a -work containing portions of the Library, and distribute that work -under terms of your choice, provided that the terms permit -modification of the work for the customer's own use and reverse -engineering for debugging such modifications. 
- - You must give prominent notice with each copy of the work that the -Library is used in it and that the Library and its use are covered by -this License. You must supply a copy of this License. If the work -during execution displays copyright notices, you must include the -copyright notice for the Library among them, as well as a reference -directing the user to the copy of this License. Also, you must do one -of these things: - - a) Accompany the work with the complete corresponding - machine-readable source code for the Library including whatever - changes were used in the work (which must be distributed under - Sections 1 and 2 above); and, if the work is an executable linked - with the Library, with the complete machine-readable "work that - uses the Library", as object code and/or source code, so that the - user can modify the Library and then relink to produce a modified - executable containing the modified Library. (It is understood - that the user who changes the contents of definitions files in the - Library will not necessarily be able to recompile the application - to use the modified definitions.) - - b) Use a suitable shared library mechanism for linking with the - Library. A suitable mechanism is one that (1) uses at run time a - copy of the library already present on the user's computer system, - rather than copying library functions into the executable, and (2) - will operate properly with a modified version of the library, if - the user installs one, as long as the modified version is - interface-compatible with the version that the work was made with. - - c) Accompany the work with a written offer, valid for at least - three years, to give the same user the materials specified in - Subsection 6a, above, for a charge no more than the cost of - performing this distribution. 
- - d) If distribution of the work is made by offering access to copy - from a designated place, offer equivalent access to copy the above - specified materials from the same place. - - e) Verify that the user has already received a copy of these - materials or that you have already sent this user a copy. - - For an executable, the required form of the "work that uses the -Library" must include any data and utility programs needed for -reproducing the executable from it. However, as a special exception, -the materials to be distributed need not include anything that is -normally distributed (in either source or binary form) with the major -components (compiler, kernel, and so on) of the operating system on -which the executable runs, unless that component itself accompanies -the executable. - - It may happen that this requirement contradicts the license -restrictions of other proprietary libraries that do not normally -accompany the operating system. Such a contradiction means you cannot -use both them and the Library together in an executable that you -distribute. - - 7. You may place library facilities that are a work based on the -Library side-by-side in a single library together with other library -facilities not covered by this License, and distribute such a combined -library, provided that the separate distribution of the work based on -the Library and of the other library facilities is otherwise -permitted, and provided that you do these two things: - - a) Accompany the combined library with a copy of the same work - based on the Library, uncombined with any other library - facilities. This must be distributed under the terms of the - Sections above. - - b) Give prominent notice with the combined library of the fact - that part of it is a work based on the Library, and explaining - where to find the accompanying uncombined form of the same work. - - 8. 
You may not copy, modify, sublicense, link with, or distribute -the Library except as expressly provided under this License. Any -attempt otherwise to copy, modify, sublicense, link with, or -distribute the Library is void, and will automatically terminate your -rights under this License. However, parties who have received copies, -or rights, from you under this License will not have their licenses -terminated so long as such parties remain in full compliance. - - 9. You are not required to accept this License, since you have not -signed it. However, nothing else grants you permission to modify or -distribute the Library or its derivative works. These actions are -prohibited by law if you do not accept this License. Therefore, by -modifying or distributing the Library (or any work based on the -Library), you indicate your acceptance of this License to do so, and -all its terms and conditions for copying, distributing or modifying -the Library or works based on it. - - 10. Each time you redistribute the Library (or any work based on the -Library), the recipient automatically receives a license from the -original licensor to copy, distribute, link with or modify the Library -subject to these terms and conditions. You may not impose any further -restrictions on the recipients' exercise of the rights granted herein. -You are not responsible for enforcing compliance by third parties with -this License. - - 11. If, as a consequence of a court judgment or allegation of patent -infringement or for any other reason (not limited to patent issues), -conditions are imposed on you (whether by court order, agreement or -otherwise) that contradict the conditions of this License, they do not -excuse you from the conditions of this License. If you cannot -distribute so as to satisfy simultaneously your obligations under this -License and any other pertinent obligations, then as a consequence you -may not distribute the Library at all. 
For example, if a patent -license would not permit royalty-free redistribution of the Library by -all those who receive copies directly or indirectly through you, then -the only way you could satisfy both it and this License would be to -refrain entirely from distribution of the Library. - -If any portion of this section is held invalid or unenforceable under -any particular circumstance, the balance of the section is intended to -apply, and the section as a whole is intended to apply in other -circumstances. - -It is not the purpose of this section to induce you to infringe any -patents or other property right claims or to contest validity of any -such claims; this section has the sole purpose of protecting the -integrity of the free software distribution system which is -implemented by public license practices. Many people have made -generous contributions to the wide range of software distributed -through that system in reliance on consistent application of that -system; it is up to the author/donor to decide if he or she is willing -to distribute software through any other system and a licensee cannot -impose that choice. - -This section is intended to make thoroughly clear what is believed to -be a consequence of the rest of this License. - - 12. If the distribution and/or use of the Library is restricted in -certain countries either by patents or by copyrighted interfaces, the -original copyright holder who places the Library under this License -may add an explicit geographical distribution limitation excluding those -countries, so that distribution is permitted only in or among -countries not thus excluded. In such case, this License incorporates -the limitation as if written in the body of this License. - - 13. The Free Software Foundation may publish revised and/or new -versions of the Lesser General Public License from time to time. 
-Such new versions will be similar in spirit to the present version, -but may differ in detail to address new problems or concerns. - -Each version is given a distinguishing version number. If the Library -specifies a version number of this License which applies to it and -"any later version", you have the option of following the terms and -conditions either of that version or of any later version published by -the Free Software Foundation. If the Library does not specify a -license version number, you may choose any version ever published by -the Free Software Foundation. - - 14. If you wish to incorporate parts of the Library into other free -programs whose distribution conditions are incompatible with these, -write to the author to ask for permission. For software which is -copyrighted by the Free Software Foundation, write to the Free -Software Foundation; we sometimes make exceptions for this. Our -decision will be guided by the two goals of preserving the free status -of all derivatives of our free software and of promoting the sharing -and reuse of software generally. - - NO WARRANTY - - 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO -WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. -EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR -OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY -KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE -IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR -PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE -LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME -THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. - - 16. 
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN -WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY -AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU -FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR -CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE -LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING -RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A -FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF -SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH -DAMAGES. - - END OF TERMS AND CONDITIONS - - How to Apply These Terms to Your New Libraries - - If you develop a new library, and you want it to be of the greatest -possible use to the public, we recommend making it free software that -everyone can redistribute and change. You can do so by permitting -redistribution under these terms (or, alternatively, under the terms -of the ordinary General Public License). - - To apply these terms, attach the following notices to the library. -It is safest to attach them to the start of each source file to most -effectively convey the exclusion of warranty; and each file should -have at least the "copyright" line and a pointer to where the full -notice is found. - - - <one line to give the library's name and a brief idea of what it does.> - Copyright (C) <year> <name of author> - - This library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. 
- - You should have received a copy of the GNU Lesser General Public - License along with this library; if not, write to the Free Software - Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA - -Also add information on how to contact you by electronic and paper mail. - -You should also get your employer (if you work as a programmer) or -your school, if any, to sign a "copyright disclaimer" for the library, -if necessary. Here is a sample; alter the names: - - Yoyodyne, Inc., hereby disclaims all copyright interest in the - library `Frob' (a library for tweaking knobs) written by James - Random Hacker. - - <signature of Ty Coon>, 1 April 1990 - Ty Coon, President of Vice - -That's all there is to it! - - diff --git a/SOURCES b/SOURCES deleted file mode 100644 index 102ab85..0000000 --- a/SOURCES +++ /dev/null @@ -1,21 +0,0 @@ -toolchain/build.git 75542e77b565c2af968e48c1b12b32f343d913ae Fix binutils-2.24 detection -toolchain/gmp.git e6b9669dafc6a5f83c80b4b4176359b78bccdc90 Add gmp-5.0.5.tar.bz2 -toolchain/mpfr.git bfcf1bfa38469208aaad8873cd4c68781061d90f add mpfr-3.1.1.tar.bz2 -toolchain/mpc.git 835d16e92eed875638a8b5d552034c3b1aae045b add mpc-1.0.1.tar.gz -toolchain/cloog.git 98972d5434ffcb4d11d2c81a46600e9a1cda9110 MinGW-w64 build fix (lacks ffs declaration) -toolchain/isl.git b05d4572958c5d497da793f3317084bab90c3033 add isl-0.11.1.tar.bz2 needed by GCC 4.8 with graphite -toolchain/ppl.git 8ba1875b4c5341d902321761022a6d2a0b5b19a4 add ppl-1.0.tar.bz2 -toolchain/expat.git 40172a0ae9d40a068f1e1a48ffcf6a1ccf765ed5 expat package for building gdb-7.3 -toolchain/binutils.git fff40e635995d00e3455f861a97d8cbf3ebb6b4e Merge "Add missing mtc1, mthc1, mfhc1 instructions to Ingenic's MXU patch." -toolchain/gcc.git 5e9aa7db94a40a0f2632b13bce2095a008fc34ac Merge "[gcc] Remove "-mstackrealign" option turned on by default on x86." 
-toolchain/gdb.git 24237bc8bc3001a82d6cd9685719c4679f721792 fix some build errors -toolchain/python.git 0d4194853e08d3244931523470331c00dfb94863 Fix python build inc_dirs[] and lib_dirs[] for linux/darwin -toolchain/perl.git 1121daca35c6c692602621eab28d4de19f0b347d Add -Dcc_as_ld to configure -toolchain/mclinker.git 5fca8b9c9c671d6c01f428c00ca131e65042a9fd Merge upstream mclinker 2.7 -toolchain/yasm.git 87c09baff80ca5bbe938392d8f320e621707f317 test commit -toolchain/clang.git (release_35) e8513044903a3766443ef6d5eab92e376f2eba32 Fix assertion failure on DeferredDeclsToEmit. -toolchain/llvm.git (release_35) 6797a08398a6b54589c37e8d8f9e9a26b4fb5621 [ndk] Fix inappropriate debug info assertion. -toolchain/compiler-rt.git (release_35) 58bfd9dc4d03012ed9e59dc21e6b2e098f9476f8 Implement __aeabi_idiv0 and __aeabi_uidiv0. -toolchain/clang.git (release_34) a284801985744bd9667b9d14f079b3a2e8b19130 [ndk] Define __USER_LABEL_PREFIX__ to empty string. -toolchain/llvm.git (release_34) b2631bf0dc9579ddfb4bc1afdf7998c12fa268af Merge "python: AC_PATH_PROG -> AC_PATH_PROGS and fix search order" into release_34 -toolchain/compiler-rt.git (release_34) 4f9b4718e8d836317c224955a1e87b7bb5252ae1 Update clear_cache to trunk@208591 diff --git a/bin/x86_64-linux-android-addr2line b/bin/x86_64-linux-android-addr2line Binary files differdeleted file mode 100755 index c49408f..0000000 --- a/bin/x86_64-linux-android-addr2line +++ /dev/null diff --git a/bin/x86_64-linux-android-ar b/bin/x86_64-linux-android-ar Binary files differdeleted file mode 100755 index 786be38..0000000 --- a/bin/x86_64-linux-android-ar +++ /dev/null diff --git a/bin/x86_64-linux-android-as b/bin/x86_64-linux-android-as Binary files differdeleted file mode 100755 index 97401ae..0000000 --- a/bin/x86_64-linux-android-as +++ /dev/null diff --git a/bin/x86_64-linux-android-c++ b/bin/x86_64-linux-android-c++ deleted file mode 120000 index 425d82a..0000000 --- a/bin/x86_64-linux-android-c++ +++ /dev/null @@ -1 +0,0 @@ 
-x86_64-linux-android-g++
\ No newline at end of file diff --git a/bin/x86_64-linux-android-c++filt b/bin/x86_64-linux-android-c++filt Binary files differdeleted file mode 100755 index 0f86435..0000000 --- a/bin/x86_64-linux-android-c++filt +++ /dev/null diff --git a/bin/x86_64-linux-android-cpp b/bin/x86_64-linux-android-cpp Binary files differdeleted file mode 100755 index 8ac38a9..0000000 --- a/bin/x86_64-linux-android-cpp +++ /dev/null diff --git a/bin/x86_64-linux-android-dwp b/bin/x86_64-linux-android-dwp Binary files differdeleted file mode 100755 index 19074b1..0000000 --- a/bin/x86_64-linux-android-dwp +++ /dev/null diff --git a/bin/x86_64-linux-android-elfedit b/bin/x86_64-linux-android-elfedit Binary files differdeleted file mode 100755 index 992504a..0000000 --- a/bin/x86_64-linux-android-elfedit +++ /dev/null diff --git a/bin/x86_64-linux-android-g++ b/bin/x86_64-linux-android-g++ Binary files differdeleted file mode 100755 index 5ca1275..0000000 --- a/bin/x86_64-linux-android-g++ +++ /dev/null diff --git a/bin/x86_64-linux-android-gcc b/bin/x86_64-linux-android-gcc Binary files differdeleted file mode 100755 index e9151ea..0000000 --- a/bin/x86_64-linux-android-gcc +++ /dev/null diff --git a/bin/x86_64-linux-android-gcc-4.8 b/bin/x86_64-linux-android-gcc-4.8 deleted file mode 120000 index c953618..0000000 --- a/bin/x86_64-linux-android-gcc-4.8 +++ /dev/null @@ -1 +0,0 @@ -x86_64-linux-android-gcc
\ No newline at end of file diff --git a/bin/x86_64-linux-android-gcc-ar b/bin/x86_64-linux-android-gcc-ar Binary files differdeleted file mode 100755 index 15f0f7a..0000000 --- a/bin/x86_64-linux-android-gcc-ar +++ /dev/null diff --git a/bin/x86_64-linux-android-gcc-nm b/bin/x86_64-linux-android-gcc-nm Binary files differdeleted file mode 100755 index b9d5b65..0000000 --- a/bin/x86_64-linux-android-gcc-nm +++ /dev/null diff --git a/bin/x86_64-linux-android-gcc-ranlib b/bin/x86_64-linux-android-gcc-ranlib Binary files differdeleted file mode 100755 index 05637dc..0000000 --- a/bin/x86_64-linux-android-gcc-ranlib +++ /dev/null diff --git a/bin/x86_64-linux-android-gcov b/bin/x86_64-linux-android-gcov Binary files differdeleted file mode 100755 index 6381d9c..0000000 --- a/bin/x86_64-linux-android-gcov +++ /dev/null diff --git a/bin/x86_64-linux-android-gdb b/bin/x86_64-linux-android-gdb Binary files differdeleted file mode 100755 index 0c3d4e5..0000000 --- a/bin/x86_64-linux-android-gdb +++ /dev/null diff --git a/bin/x86_64-linux-android-gprof b/bin/x86_64-linux-android-gprof Binary files differdeleted file mode 100755 index 452517f..0000000 --- a/bin/x86_64-linux-android-gprof +++ /dev/null diff --git a/bin/x86_64-linux-android-ld b/bin/x86_64-linux-android-ld deleted file mode 120000 index 3d3aa39..0000000 --- a/bin/x86_64-linux-android-ld +++ /dev/null @@ -1 +0,0 @@ -x86_64-linux-android-ld.gold
\ No newline at end of file diff --git a/bin/x86_64-linux-android-ld.bfd b/bin/x86_64-linux-android-ld.bfd Binary files differdeleted file mode 100755 index ed06964..0000000 --- a/bin/x86_64-linux-android-ld.bfd +++ /dev/null diff --git a/bin/x86_64-linux-android-ld.gold b/bin/x86_64-linux-android-ld.gold Binary files differdeleted file mode 100755 index 2ebf7ff..0000000 --- a/bin/x86_64-linux-android-ld.gold +++ /dev/null diff --git a/bin/x86_64-linux-android-ld.mcld b/bin/x86_64-linux-android-ld.mcld Binary files differdeleted file mode 100755 index 8f2f743..0000000 --- a/bin/x86_64-linux-android-ld.mcld +++ /dev/null diff --git a/bin/x86_64-linux-android-nm b/bin/x86_64-linux-android-nm Binary files differdeleted file mode 100755 index 446f60b..0000000 --- a/bin/x86_64-linux-android-nm +++ /dev/null diff --git a/bin/x86_64-linux-android-objcopy b/bin/x86_64-linux-android-objcopy Binary files differdeleted file mode 100755 index 08a9cf4..0000000 --- a/bin/x86_64-linux-android-objcopy +++ /dev/null diff --git a/bin/x86_64-linux-android-objdump b/bin/x86_64-linux-android-objdump Binary files differdeleted file mode 100755 index 375ad3d..0000000 --- a/bin/x86_64-linux-android-objdump +++ /dev/null diff --git a/bin/x86_64-linux-android-ranlib b/bin/x86_64-linux-android-ranlib Binary files differdeleted file mode 100755 index 6586879..0000000 --- a/bin/x86_64-linux-android-ranlib +++ /dev/null diff --git a/bin/x86_64-linux-android-readelf b/bin/x86_64-linux-android-readelf Binary files differdeleted file mode 100755 index 2d0fcef..0000000 --- a/bin/x86_64-linux-android-readelf +++ /dev/null diff --git a/bin/x86_64-linux-android-size b/bin/x86_64-linux-android-size Binary files differdeleted file mode 100755 index 3fa9633..0000000 --- a/bin/x86_64-linux-android-size +++ /dev/null diff --git a/bin/x86_64-linux-android-strings b/bin/x86_64-linux-android-strings Binary files differdeleted file mode 100755 index 3c52910..0000000 --- a/bin/x86_64-linux-android-strings +++ 
/dev/null diff --git a/bin/x86_64-linux-android-strip b/bin/x86_64-linux-android-strip Binary files differdeleted file mode 100755 index 308b7f1..0000000 --- a/bin/x86_64-linux-android-strip +++ /dev/null diff --git a/include/gdb/jit-reader.h b/include/gdb/jit-reader.h deleted file mode 100644 index 7cff81a..0000000 --- a/include/gdb/jit-reader.h +++ /dev/null @@ -1,346 +0,0 @@ -/* JIT declarations for GDB, the GNU Debugger. - - Copyright (C) 2011-2013 Free Software Foundation, Inc. - - This file is part of GDB. - - This program is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3 of the License, or - (at your option) any later version. - - This program is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - You should have received a copy of the GNU General Public License - along with this program. If not, see <http://www.gnu.org/licenses/>. */ - -#ifndef GDB_JIT_READER_H -#define GDB_JIT_READER_H - -#ifdef __cplusplus -extern "C" { -#endif - -/* Versioning information. See gdb_reader_funcs. */ - -#define GDB_READER_INTERFACE_VERSION 1 - -/* Readers must be released under a GPL compatible license. To - declare that the reader is indeed released under a GPL compatible - license, invoke the macro GDB_DECLARE_GPL_COMPATIBLE in a source - file. 
*/
-
-#ifdef __cplusplus
-#define GDB_DECLARE_GPL_COMPATIBLE_READER \
- extern "C" { \
- extern int plugin_is_GPL_compatible (void); \
- extern int plugin_is_GPL_compatible (void) \
- { \
- return 0; \
- } \
- }
-
-#else
-
-#define GDB_DECLARE_GPL_COMPATIBLE_READER \
- extern int plugin_is_GPL_compatible (void); \
- extern int plugin_is_GPL_compatible (void) \
- { \
- return 0; \
- }
-
-#endif
-
-/* Represents an address on the target system. */
-
-typedef unsigned long GDB_CORE_ADDR;
-
-/* Return status codes. */
-
-enum gdb_status {
- GDB_FAIL = 0,
- GDB_SUCCESS = 1
-};
-
-struct gdb_object;
-struct gdb_symtab;
-struct gdb_block;
-struct gdb_symbol_callbacks;
-
-/* An array of these is used to represent a map from code addresses to line
- numbers in the source file. */
-
-struct gdb_line_mapping
-{
- int line;
- GDB_CORE_ADDR pc;
-};
-
-/* Create a new GDB code object. Each code object can have one or
- more symbol tables, each representing a compiled source file. */
-
-typedef struct gdb_object *(gdb_object_open) (struct gdb_symbol_callbacks *cb);
-
-/* The callback used to create a new symbol table. CB is the
- gdb_symbol_callbacks which the structure is part of. FILE_NAME is
- an (optionally NULL) file name to associate with this new symbol
- table.
-
- Returns a new instance of gdb_symtab that can later be passed to
- gdb_block_open, gdb_symtab_add_line_mapping and gdb_symtab_close. */
-
-typedef struct gdb_symtab *(gdb_symtab_open) (struct gdb_symbol_callbacks *cb,
- struct gdb_object *obj,
- const char *file_name);
-
-/* Creates a new block in a given symbol table. A symbol table is a
- forest of blocks, each block representing a code address range and
- a corresponding (optionally NULL) NAME. In case the block
- corresponds to a function, the NAME passed should be the name of
- the function.
-
- If the new block to be created is a child of (i.e. is nested in)
- another block, the parent block can be passed in PARENT.
SYMTAB is - the symbol table the new block is to belong in. BEGIN, END is the - code address range the block corresponds to. - - Returns a new instance of gdb_block, which, as of now, has no use. - Note that the gdb_block returned must not be freed by the - caller. */ - -typedef struct gdb_block *(gdb_block_open) (struct gdb_symbol_callbacks *cb, - struct gdb_symtab *symtab, - struct gdb_block *parent, - GDB_CORE_ADDR begin, - GDB_CORE_ADDR end, - const char *name); - -/* Adds a PC to line number mapping for the symbol table SYMTAB. - NLINES is the number of elements in LINES, each element - corresponding to one (PC, line) pair. */ - -typedef void (gdb_symtab_add_line_mapping) (struct gdb_symbol_callbacks *cb, - struct gdb_symtab *symtab, - int nlines, - struct gdb_line_mapping *lines); - -/* Close the symtab SYMTAB. This signals to GDB that no more blocks - will be opened on this symtab. */ - -typedef void (gdb_symtab_close) (struct gdb_symbol_callbacks *cb, - struct gdb_symtab *symtab); - - -/* Closes the gdb_object OBJ and adds the emitted information into - GDB's internal structures. Once this is done, the debug - information will be picked up and used; this will usually be the - last operation in gdb_read_debug_info. */ - -typedef void (gdb_object_close) (struct gdb_symbol_callbacks *cb, - struct gdb_object *obj); - -/* Reads LEN bytes from TARGET_MEM in the target's virtual address - space into GDB_BUF. - - Returns GDB_FAIL on failure, and GDB_SUCCESS on success. */ - -typedef enum gdb_status (gdb_target_read) (GDB_CORE_ADDR target_mem, - void *gdb_buf, int len); - -/* The list of callbacks that are passed to read. These callbacks are - to be used to construct the symbol table. The functions have been - described above. 
*/
-
-struct gdb_symbol_callbacks
-{
- gdb_object_open *object_open;
- gdb_symtab_open *symtab_open;
- gdb_block_open *block_open;
- gdb_symtab_close *symtab_close;
- gdb_object_close *object_close;
-
- gdb_symtab_add_line_mapping *line_mapping_add;
- gdb_target_read *target_read;
-
- /* For internal use by GDB. */
- void *priv_data;
-};
-
-/* Forward declaration. */
-
-struct gdb_reg_value;
-
-/* A function of this type is used to free a gdb_reg_value. See the
- comment on `free' in struct gdb_reg_value. */
-
-typedef void (gdb_reg_value_free) (struct gdb_reg_value *);
-
-/* Denotes the value of a register. */
-
-struct gdb_reg_value
-{
- /* The size of the register in bytes. The reader need not set this
- field. This will be set for (defined) register values being read
- from GDB using reg_get. */
- int size;
-
- /* Set to non-zero if the value for the register is known. The
- registers for which the reader does not call reg_set are also
- assumed to be undefined. */
- int defined;
-
- /* Since gdb_reg_value is a variable sized structure, it will
- usually be allocated on the heap. This function is expected to
- contain the corresponding "free" function.
-
- When a pointer to gdb_reg_value is being sent from GDB to the
- reader (via gdb_unwind_reg_get), the reader is expected to call
- this function (with the same gdb_reg_value as argument) once it
- is done with the value.
-
- When the function sends a gdb_reg_value to GDB (via
- gdb_unwind_reg_set), it is expected to set this field to point to
- an appropriate cleanup routine (or to NULL if no cleanup is
- required). */
- gdb_reg_value_free *free;
-
- /* The value of the register. */
- unsigned char value[1];
-};
-
-/* get_frame_id in gdb_reader_funcs is to return a gdb_frame_id
- corresponding to the current frame. The registers corresponding to
- the current frame can be read using reg_get. Calling get_frame_id
- on a particular frame should return the same gdb_frame_id
- throughout its lifetime (i.e.
till before it gets unwound). One
- way to do this is by having the CODE_ADDRESS point to the
- function's first instruction and STACK_ADDRESS point to the value
- of the stack pointer when entering the function. */
-
-struct gdb_frame_id
-{
- GDB_CORE_ADDR code_address;
- GDB_CORE_ADDR stack_address;
-};
-
-/* Forward declaration. */
-
-struct gdb_unwind_callbacks;
-
-/* Returns the value of a particular register in the current frame.
- The current frame is the frame that needs to be unwound into the
- outer (earlier) frame.
-
- CB is the struct gdb_unwind_callbacks * the callback belongs to.
- REGNUM is the DWARF register number of the register that needs to
- be unwound.
-
- Returns the gdb_reg_value corresponding to the register requested.
- In case the value of the register has been optimized away or is
- otherwise unavailable, the defined flag in the returned
- gdb_reg_value will be zero. */
-
-typedef struct gdb_reg_value *(gdb_unwind_reg_get)
- (struct gdb_unwind_callbacks *cb, int regnum);
-
-/* Sets the previous value of a particular register. REGNUM is the
- (DWARF) register number whose value is to be set. VAL is the value
- the register is to be set to.
-
- VAL is *not* copied, so the memory allocated to it cannot be
- reused. Once GDB no longer needs the value, it is deallocated
- using the FREE function (see gdb_reg_value).
-
- A register can also be "set" to an undefined value by setting the
- defined flag in VAL to zero. */
-
-typedef void (gdb_unwind_reg_set) (struct gdb_unwind_callbacks *cb, int regnum,
- struct gdb_reg_value *val);
-
-/* This struct is passed to unwind in gdb_reader_funcs, and is to be
- used to unwind the current frame (current being the frame whose
- registers can be read using reg_get) into the earlier frame. The
- functions have been described above. */
-
-struct gdb_unwind_callbacks
-{
- gdb_unwind_reg_get *reg_get;
- gdb_unwind_reg_set *reg_set;
- gdb_target_read *target_read;
-
- /* For internal use by GDB.
*/ - void *priv_data; -}; - -/* Forward declaration. */ - -struct gdb_reader_funcs; - -/* Parse the debug info off a block of memory, pointed to by MEMORY - (already copied to GDB's address space) and MEMORY_SZ bytes long. - The implementation has to use the functions in CB to actually emit - the parsed data into GDB. SELF is the same structure returned by - gdb_init_reader. - - Return GDB_FAIL on failure and GDB_SUCCESS on success. */ - -typedef enum gdb_status (gdb_read_debug_info) (struct gdb_reader_funcs *self, - struct gdb_symbol_callbacks *cb, - void *memory, long memory_sz); - -/* Unwind the current frame, CB is the set of unwind callbacks that - are to be used to do this. - - Return GDB_FAIL on failure and GDB_SUCCESS on success. */ - -typedef enum gdb_status (gdb_unwind_frame) (struct gdb_reader_funcs *self, - struct gdb_unwind_callbacks *cb); - -/* Return the frame ID corresponding to the current frame, using C to - read the current register values. See the comment on struct - gdb_frame_id. */ - -typedef struct gdb_frame_id (gdb_get_frame_id) (struct gdb_reader_funcs *self, - struct gdb_unwind_callbacks *c); - -/* Called when a reader is being unloaded. This function should also - free SELF, if required. */ - -typedef void (gdb_destroy_reader) (struct gdb_reader_funcs *self); - -/* Called when the reader is loaded. Must either return a properly - populated gdb_reader_funcs or NULL. The memory allocated for the - gdb_reader_funcs is to be managed by the reader itself (i.e. if it - is allocated from the heap, it must also be freed in - gdb_destroy_reader). */ - -extern struct gdb_reader_funcs *gdb_init_reader (void); - -/* Pointer to the functions which implement the reader's - functionality. The individual functions have been documented - above. - - None of the fields are optional. */ - -struct gdb_reader_funcs -{ - /* Must be set to GDB_READER_INTERFACE_VERSION. */ - int reader_version; - - /* For use by the reader. 
*/ - void *priv_data; - - gdb_read_debug_info *read; - gdb_unwind_frame *unwind; - gdb_get_frame_id *get_frame_id; - gdb_destroy_reader *destroy; -}; - -#ifdef __cplusplus -} /* extern "C" */ -#endif - -#endif diff --git a/lib/gcc/x86_64-linux-android/4.8/32/crtbegin.o b/lib/gcc/x86_64-linux-android/4.8/32/crtbegin.o Binary files differdeleted file mode 100644 index 2ba9dc2..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/32/crtbegin.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/32/crtbeginS.o b/lib/gcc/x86_64-linux-android/4.8/32/crtbeginS.o Binary files differdeleted file mode 100644 index d1fcc79..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/32/crtbeginS.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/32/crtbeginT.o b/lib/gcc/x86_64-linux-android/4.8/32/crtbeginT.o Binary files differdeleted file mode 100644 index 2ba9dc2..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/32/crtbeginT.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/32/crtend.o b/lib/gcc/x86_64-linux-android/4.8/32/crtend.o Binary files differdeleted file mode 100644 index 71ae4dd..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/32/crtend.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/32/crtendS.o b/lib/gcc/x86_64-linux-android/4.8/32/crtendS.o Binary files differdeleted file mode 100644 index 71ae4dd..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/32/crtendS.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/32/crtfastmath.o b/lib/gcc/x86_64-linux-android/4.8/32/crtfastmath.o Binary files differdeleted file mode 100644 index 923638f..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/32/crtfastmath.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/32/crtprec32.o b/lib/gcc/x86_64-linux-android/4.8/32/crtprec32.o Binary files differdeleted file mode 100644 index ad4d126..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/32/crtprec32.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/32/crtprec64.o 
b/lib/gcc/x86_64-linux-android/4.8/32/crtprec64.o Binary files differ deleted file mode 100644 index db642f9..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/32/crtprec64.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/32/crtprec80.o b/lib/gcc/x86_64-linux-android/4.8/32/crtprec80.o Binary files differ deleted file mode 100644 index ff98125..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/32/crtprec80.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/32/libgcc.a b/lib/gcc/x86_64-linux-android/4.8/32/libgcc.a Binary files differ deleted file mode 100644 index d50e7a7..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/32/libgcc.a +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/32/libgcov.a b/lib/gcc/x86_64-linux-android/4.8/32/libgcov.a Binary files differ deleted file mode 100644 index 38ee991..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/32/libgcov.a +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/crtbegin.o b/lib/gcc/x86_64-linux-android/4.8/crtbegin.o Binary files differ deleted file mode 100644 index 4982353..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/crtbegin.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/crtbeginS.o b/lib/gcc/x86_64-linux-android/4.8/crtbeginS.o Binary files differ deleted file mode 100644 index a44e714..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/crtbeginS.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/crtbeginT.o b/lib/gcc/x86_64-linux-android/4.8/crtbeginT.o Binary files differ deleted file mode 100644 index 4982353..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/crtbeginT.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/crtend.o b/lib/gcc/x86_64-linux-android/4.8/crtend.o Binary files differ deleted file mode 100644 index 23792ab..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/crtend.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/crtendS.o b/lib/gcc/x86_64-linux-android/4.8/crtendS.o Binary files differ deleted file mode 100644 index
23792ab..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/crtendS.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/crtfastmath.o b/lib/gcc/x86_64-linux-android/4.8/crtfastmath.o Binary files differ deleted file mode 100644 index 2f71001..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/crtfastmath.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/crtprec32.o b/lib/gcc/x86_64-linux-android/4.8/crtprec32.o Binary files differ deleted file mode 100644 index 969b383..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/crtprec32.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/crtprec64.o b/lib/gcc/x86_64-linux-android/4.8/crtprec64.o Binary files differ deleted file mode 100644 index 69092e3..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/crtprec64.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/crtprec80.o b/lib/gcc/x86_64-linux-android/4.8/crtprec80.o Binary files differ deleted file mode 100644 index 4813cfa..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/crtprec80.o +++ /dev/null diff --git a/lib/gcc/x86_64-linux-android/4.8/include-fixed/README b/lib/gcc/x86_64-linux-android/4.8/include-fixed/README deleted file mode 100644 index 7086a77..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include-fixed/README +++ /dev/null @@ -1,14 +0,0 @@ -This README file is copied into the directory for GCC-only header files -when fixincludes is run by the makefile for GCC. - -Many of the files in this directory were automatically edited from the -standard system header files by the fixincludes process. They are -system-specific, and will not work on any other kind of system. They -are also not part of GCC. The reason we have to do this is because -GCC requires ANSI C headers and many vendors supply ANSI-incompatible -headers. - -Because this is an automated process, sometimes headers get "fixed" -that do not, strictly speaking, need a fix. As long as nothing is broken -by the process, it is just an unfortunate collateral inconvenience.
-We would like to rectify it, if it is not "too inconvenient". diff --git a/lib/gcc/x86_64-linux-android/4.8/include-fixed/limits.h b/lib/gcc/x86_64-linux-android/4.8/include-fixed/limits.h deleted file mode 100644 index 9640a88..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include-fixed/limits.h +++ /dev/null @@ -1,171 +0,0 @@ -/* Copyright (C) 1992-2013 Free Software Foundation, Inc. - -This file is part of GCC. - -GCC is free software; you can redistribute it and/or modify it under -the terms of the GNU General Public License as published by the Free -Software Foundation; either version 3, or (at your option) any later -version. - -GCC is distributed in the hope that it will be useful, but WITHOUT ANY -WARRANTY; without even the implied warranty of MERCHANTABILITY or -FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License -for more details. - -Under Section 7 of GPL version 3, you are granted additional -permissions described in the GCC Runtime Library Exception, version -3.1, as published by the Free Software Foundation. - -You should have received a copy of the GNU General Public License and -a copy of the GCC Runtime Library Exception along with this program; -see the files COPYING3 and COPYING.RUNTIME respectively. If not, see -<http://www.gnu.org/licenses/>. */ - -/* This administrivia gets added to the beginning of limits.h - if the system has its own version of limits.h. */ - -/* We use _GCC_LIMITS_H_ because we want this not to match - any macros that the system's limits.h uses for its own purposes. */ -#ifndef _GCC_LIMITS_H_ /* Terminated in limity.h. */ -#define _GCC_LIMITS_H_ - -#ifndef _LIBC_LIMITS_H_ -/* Use "..." so that we find syslimits.h only in this same directory. */ -#include "syslimits.h" -#endif -/* Copyright (C) 1991-2013 Free Software Foundation, Inc. - -This file is part of GCC. 
- -GCC is free software; you can redistribute it and/or modify it under -the terms of the GNU General Public License as published by the Free -Software Foundation; either version 3, or (at your option) any later -version. - -GCC is distributed in the hope that it will be useful, but WITHOUT ANY -WARRANTY; without even the implied warranty of MERCHANTABILITY or -FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License -for more details. - -Under Section 7 of GPL version 3, you are granted additional -permissions described in the GCC Runtime Library Exception, version -3.1, as published by the Free Software Foundation. - -You should have received a copy of the GNU General Public License and -a copy of the GCC Runtime Library Exception along with this program; -see the files COPYING3 and COPYING.RUNTIME respectively. If not, see -<http://www.gnu.org/licenses/>. */ - -#ifndef _LIMITS_H___ -#define _LIMITS_H___ - -/* Number of bits in a `char'. */ -#undef CHAR_BIT -#define CHAR_BIT __CHAR_BIT__ - -/* Maximum length of a multibyte character. */ -#ifndef MB_LEN_MAX -#define MB_LEN_MAX 1 -#endif - -/* Minimum and maximum values a `signed char' can hold. */ -#undef SCHAR_MIN -#define SCHAR_MIN (-SCHAR_MAX - 1) -#undef SCHAR_MAX -#define SCHAR_MAX __SCHAR_MAX__ - -/* Maximum value an `unsigned char' can hold. (Minimum is 0). */ -#undef UCHAR_MAX -#if __SCHAR_MAX__ == __INT_MAX__ -# define UCHAR_MAX (SCHAR_MAX * 2U + 1U) -#else -# define UCHAR_MAX (SCHAR_MAX * 2 + 1) -#endif - -/* Minimum and maximum values a `char' can hold. */ -#ifdef __CHAR_UNSIGNED__ -# undef CHAR_MIN -# if __SCHAR_MAX__ == __INT_MAX__ -# define CHAR_MIN 0U -# else -# define CHAR_MIN 0 -# endif -# undef CHAR_MAX -# define CHAR_MAX UCHAR_MAX -#else -# undef CHAR_MIN -# define CHAR_MIN SCHAR_MIN -# undef CHAR_MAX -# define CHAR_MAX SCHAR_MAX -#endif - -/* Minimum and maximum values a `signed short int' can hold. 
*/ -#undef SHRT_MIN -#define SHRT_MIN (-SHRT_MAX - 1) -#undef SHRT_MAX -#define SHRT_MAX __SHRT_MAX__ - -/* Maximum value an `unsigned short int' can hold. (Minimum is 0). */ -#undef USHRT_MAX -#if __SHRT_MAX__ == __INT_MAX__ -# define USHRT_MAX (SHRT_MAX * 2U + 1U) -#else -# define USHRT_MAX (SHRT_MAX * 2 + 1) -#endif - -/* Minimum and maximum values a `signed int' can hold. */ -#undef INT_MIN -#define INT_MIN (-INT_MAX - 1) -#undef INT_MAX -#define INT_MAX __INT_MAX__ - -/* Maximum value an `unsigned int' can hold. (Minimum is 0). */ -#undef UINT_MAX -#define UINT_MAX (INT_MAX * 2U + 1U) - -/* Minimum and maximum values a `signed long int' can hold. - (Same as `int'). */ -#undef LONG_MIN -#define LONG_MIN (-LONG_MAX - 1L) -#undef LONG_MAX -#define LONG_MAX __LONG_MAX__ - -/* Maximum value an `unsigned long int' can hold. (Minimum is 0). */ -#undef ULONG_MAX -#define ULONG_MAX (LONG_MAX * 2UL + 1UL) - -#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L -/* Minimum and maximum values a `signed long long int' can hold. */ -# undef LLONG_MIN -# define LLONG_MIN (-LLONG_MAX - 1LL) -# undef LLONG_MAX -# define LLONG_MAX __LONG_LONG_MAX__ - -/* Maximum value an `unsigned long long int' can hold. (Minimum is 0). */ -# undef ULLONG_MAX -# define ULLONG_MAX (LLONG_MAX * 2ULL + 1ULL) -#endif - -#if defined (__GNU_LIBRARY__) ? defined (__USE_GNU) : !defined (__STRICT_ANSI__) -/* Minimum and maximum values a `signed long long int' can hold. */ -# undef LONG_LONG_MIN -# define LONG_LONG_MIN (-LONG_LONG_MAX - 1LL) -# undef LONG_LONG_MAX -# define LONG_LONG_MAX __LONG_LONG_MAX__ - -/* Maximum value an `unsigned long long int' can hold. (Minimum is 0). */ -# undef ULONG_LONG_MAX -# define ULONG_LONG_MAX (LONG_LONG_MAX * 2ULL + 1ULL) -#endif - -#endif /* _LIMITS_H___ */ -/* This administrivia gets added to the end of limits.h - if the system has its own version of limits.h. 
*/ - -#else /* not _GCC_LIMITS_H_ */ - -#ifdef _GCC_NEXT_LIMITS_H -#include_next <limits.h> /* recurse down to the real one */ -#endif - -#endif /* not _GCC_LIMITS_H_ */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include-fixed/syslimits.h b/lib/gcc/x86_64-linux-android/4.8/include-fixed/syslimits.h deleted file mode 100644 index a362802..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include-fixed/syslimits.h +++ /dev/null @@ -1,8 +0,0 @@ -/* syslimits.h stands for the system's own limits.h file. - If we can use it ok unmodified, then we install this text. - If fixincludes fixes it, then the fixed version is installed - instead of this text. */ - -#define _GCC_NEXT_LIMITS_H /* tell gcc's limits.h to recurse */ -#include_next <limits.h> -#undef _GCC_NEXT_LIMITS_H diff --git a/lib/gcc/x86_64-linux-android/4.8/include/adxintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/adxintrin.h deleted file mode 100644 index 5c0ea9f..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/adxintrin.h +++ /dev/null @@ -1,49 +0,0 @@ -/* Copyright (C) 2012-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. 
If not, see - <http://www.gnu.org/licenses/>. */ - -#if !defined _X86INTRIN_H_INCLUDED && !defined _IMMINTRIN_H_INCLUDED -# error "Never use <adxintrin.h> directly; include <x86intrin.h> instead." -#endif - -#ifndef _ADXINTRIN_H_INCLUDED -#define _ADXINTRIN_H_INCLUDED - -extern __inline unsigned char -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_addcarryx_u32 (unsigned char __CF, unsigned int __X, - unsigned int __Y, unsigned int *__P) -{ - return __builtin_ia32_addcarryx_u32 (__CF, __X, __Y, __P); -} - -#ifdef __x86_64__ -extern __inline unsigned char -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_addcarryx_u64 (unsigned char __CF, unsigned long __X, - unsigned long __Y, unsigned long long *__P) -{ - return __builtin_ia32_addcarryx_u64 (__CF, __X, __Y, __P); -} -#endif - -#endif /* _ADXINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/ammintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/ammintrin.h deleted file mode 100644 index 311292c..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/ammintrin.h +++ /dev/null @@ -1,88 +0,0 @@ -/* Copyright (C) 2007-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. 
- - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -/* Implemented from the specification included in the AMD Programmers - Manual Update, version 2.x */ - -#ifndef _AMMINTRIN_H_INCLUDED -#define _AMMINTRIN_H_INCLUDED - -#ifndef __SSE4A__ -# error "SSE4A instruction set not enabled" -#else - -/* We need definitions from the SSE3, SSE2 and SSE header files*/ -#include <pmmintrin.h> - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_stream_sd (double * __P, __m128d __Y) -{ - __builtin_ia32_movntsd (__P, (__v2df) __Y); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_stream_ss (float * __P, __m128 __Y) -{ - __builtin_ia32_movntss (__P, (__v4sf) __Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_extract_si64 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_extrq ((__v2di) __X, (__v16qi) __Y); -} - -#ifdef __OPTIMIZE__ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_extracti_si64 (__m128i __X, unsigned const int __I, unsigned const int __L) -{ - return (__m128i) __builtin_ia32_extrqi ((__v2di) __X, __I, __L); -} -#else -#define _mm_extracti_si64(X, I, L) \ - ((__m128i) __builtin_ia32_extrqi ((__v2di)(__m128i)(X), \ - (unsigned int)(I), (unsigned int)(L))) -#endif - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_insert_si64 (__m128i __X,__m128i __Y) -{ - return (__m128i) __builtin_ia32_insertq ((__v2di)__X, (__v2di)__Y); -} - -#ifdef __OPTIMIZE__ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_inserti_si64(__m128i __X, __m128i __Y, unsigned const int __I, 
unsigned const int __L) -{ - return (__m128i) __builtin_ia32_insertqi ((__v2di)__X, (__v2di)__Y, __I, __L); -} -#else -#define _mm_inserti_si64(X, Y, I, L) \ - ((__m128i) __builtin_ia32_insertqi ((__v2di)(__m128i)(X), \ - (__v2di)(__m128i)(Y), \ - (unsigned int)(I), (unsigned int)(L))) -#endif - -#endif /* __SSE4A__ */ - -#endif /* _AMMINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/arm_neon.h b/lib/gcc/x86_64-linux-android/4.8/include/arm_neon.h deleted file mode 100644 index 10944f3..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/arm_neon.h +++ /dev/null @@ -1,16622 +0,0 @@ -//created by Victoria Zhislina, the Senior Application Engineer, Intel Corporation, victoria.zhislina@intel.com - -//*** Copyright (C) 2012-2014 Intel Corporation. All rights reserved. - -//IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING. - -//By downloading, copying, installing or using the software you agree to this license. -//If you do not agree to this license, do not download, install, copy or use the software. - -// License Agreement - -//Permission to use, copy, modify, and/or distribute this software for any -//purpose with or without fee is hereby granted, provided that the above -//copyright notice and this permission notice appear in all copies. - -//THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH -//REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY -//AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, -//INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM -//LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR -//OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR -//PERFORMANCE OF THIS SOFTWARE. 
- -//***************************************************************************************** -// This file is intended to simplify ARM->IA32 porting -// It makes the correspondence between ARM NEON intrinsics (as defined in "arm_neon.h") -// and x86 SSE(up to SSE4.2) intrinsic functions as defined in header files below -// MMX instruction set is not used due to performance overhead and the necessity to use the -// EMMS instruction (_mm_empty()) for mmx-x87 floating point switching -//***************************************************************************************** - -//!!!!!!! To use this file in your project that uses ARM NEON intrinsics just keep arm_neon.h included and compile it as usual. -//!!!!!!! Please pay attention to USE_SSE4 below - you need to define it for newest Intel platforms for -//!!!!!!! greater performance. It can be done by -msse4.2 compiler switch. - -#ifndef NEON2SSE_H -#define NEON2SSE_H - -#ifndef USE_SSE4 -#if defined(__SSE4_2__) - #define USE_SSE4 -#endif -#endif - -#include <xmmintrin.h> //SSE -#include <emmintrin.h> //SSE2 -#include <pmmintrin.h> //SSE3 -#include <tmmintrin.h> //SSSE3 -#ifdef USE_SSE4 -#include <smmintrin.h> //SSE4.1 -#include <nmmintrin.h> //SSE4.2 -#endif - - -//*************** functions and data attributes, compiler dependent ********************************* -//*********************************************************************************** -#ifdef __GNUC__ -#define _GCC_VERSION (__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__) -#define _NEON2SSE_ALIGN_16 __attribute__((aligned(16))) -#define _NEON2SSE_INLINE extern inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -#if _GCC_VERSION < 40500 - #define _NEON2SSE_PERFORMANCE_WARNING(function, explanation) __attribute__((deprecated)) function -#else - #define _NEON2SSE_PERFORMANCE_WARNING(function, explanation) __attribute__((deprecated(explanation))) function -#endif -#if defined(__x86_64__) - #define _NEON2SSE_64BIT
__x86_64__ -#endif -#else -#define _NEON2SSE_ALIGN_16 __declspec(align(16)) -#define _NEON2SSE_INLINE __inline -#if defined(_MSC_VER)|| defined (__INTEL_COMPILER) - #define _NEON2SSE_PERFORMANCE_WARNING(function, EXPLANATION) __declspec(deprecated(EXPLANATION)) function -#if defined(_M_X64) - #define _NEON2SSE_64BIT _M_X64 -#endif -#else - #define _NEON2SSE_PERFORMANCE_WARNING(function, explanation) function -#endif -#endif - -#if defined (_NEON2SSE_64BIT) && defined (USE_SSE4) - #define _NEON2SSE_64BIT_SSE4 -#endif - -/*********************************************************************************************************************/ -// data types conversion -/*********************************************************************************************************************/ -#if defined(_MSC_VER) && (_MSC_VER < 1300) - typedef signed char int8_t; - typedef unsigned char uint8_t; - typedef signed short int16_t; - typedef unsigned short uint16_t; - typedef signed int int32_t; - typedef unsigned int uint32_t; - typedef signed long long int64_t; - typedef unsigned long long uint64_t; -#elif defined(_MSC_VER) - typedef signed __int8 int8_t; - typedef unsigned __int8 uint8_t; - typedef signed __int16 int16_t; - typedef unsigned __int16 uint16_t; - typedef signed __int32 int32_t; - typedef unsigned __int32 uint32_t; - - typedef signed long long int64_t; - typedef unsigned long long uint64_t; -#else -#include <stdint.h> -#include <limits.h> -#endif - -typedef union __m64_128 { - uint64_t m64_u64[1]; - float m64_f32[2]; - int8_t m64_i8[8]; - int16_t m64_i16[4]; - int32_t m64_i32[2]; - int64_t m64_i64[1]; - uint8_t m64_u8[8]; - uint16_t m64_u16[4]; - uint32_t m64_u32[2]; -} __m64_128; - -typedef __m64_128 int8x8_t; -typedef __m64_128 uint8x8_t; -typedef __m64_128 int16x4_t; -typedef __m64_128 uint16x4_t; -typedef __m64_128 int32x2_t; -typedef __m64_128 uint32x2_t; -typedef __m64_128 int64x1_t; -typedef __m64_128 uint64x1_t; -typedef __m64_128 poly8x8_t; -typedef 
__m64_128 poly16x4_t; - -typedef __m64_128 float32x2_t; -typedef __m128 float32x4_t; - -typedef __m128 float16x4_t; //not supported by IA, for compatibility -typedef __m128 float16x8_t; //not supported by IA, for compatibility - -typedef __m128i int8x16_t; -typedef __m128i int16x8_t; -typedef __m128i int32x4_t; -typedef __m128i int64x2_t; -typedef __m128i uint8x16_t; -typedef __m128i uint16x8_t; -typedef __m128i uint32x4_t; -typedef __m128i uint64x2_t; -typedef __m128i poly8x16_t; -typedef __m128i poly16x8_t; - -#if defined(_MSC_VER) - #define SINT_MIN (-2147483647 - 1) /* min signed int value */ - #define SINT_MAX 2147483647 /* max signed int value */ -#else - #define SINT_MIN INT_MIN /* min signed int value */ - #define SINT_MAX INT_MAX /* max signed int value */ -#endif - -typedef float float32_t; -#if !defined(__clang__) -typedef float __fp16; -#endif - -typedef uint8_t poly8_t; -typedef uint16_t poly16_t; - - -//MSVC compilers (tested up to the 2012 VS version) don't allow using structures or arrays of __m128x type as function arguments, resulting in -//error C2719: 'src': formal parameter with __declspec(align('16')) won't be aligned. To avoid it we need a special trick for functions that use these types -struct int8x16x2_t { - int8x16_t val[2]; -}; -struct int16x8x2_t { - int16x8_t val[2]; -}; -struct int32x4x2_t { - int32x4_t val[2]; -}; -struct int64x2x2_t { - int64x2_t val[2]; -}; -//Unfortunately we are unable to merge the two 64-bit halves into one 128-bit register because the user should be able to access val[n] members explicitly!!!
-struct int8x8x2_t { - int8x8_t val[2]; -}; -struct int16x4x2_t { - int16x4_t val[2]; -}; -struct int32x2x2_t { - int32x2_t val[2]; -}; -struct int64x1x2_t { - int64x1_t val[2]; -}; - -typedef struct int8x16x2_t int8x16x2_t; //for C compilers to make them happy -typedef struct int16x8x2_t int16x8x2_t; //for C compilers to make them happy -typedef struct int32x4x2_t int32x4x2_t; //for C compilers to make them happy -typedef struct int64x2x2_t int64x2x2_t; //for C compilers to make them happy - -typedef struct int8x8x2_t int8x8x2_t; //for C compilers to make them happy -typedef struct int16x4x2_t int16x4x2_t; //for C compilers to make them happy -typedef struct int32x2x2_t int32x2x2_t; //for C compilers to make them happy -typedef struct int64x1x2_t int64x1x2_t; //for C compilers to make them happy - -/* to avoid pointer conversions the following unsigned integers structures are defined via the corresponding signed integers structures above */ -typedef struct int8x16x2_t uint8x16x2_t; -typedef struct int16x8x2_t uint16x8x2_t; -typedef struct int32x4x2_t uint32x4x2_t; -typedef struct int64x2x2_t uint64x2x2_t; -typedef struct int8x16x2_t poly8x16x2_t; -typedef struct int16x8x2_t poly16x8x2_t; - -typedef struct int8x8x2_t uint8x8x2_t; -typedef struct int16x4x2_t uint16x4x2_t; -typedef struct int32x2x2_t uint32x2x2_t; -typedef struct int64x1x2_t uint64x1x2_t; -typedef struct int8x8x2_t poly8x8x2_t; -typedef struct int16x4x2_t poly16x4x2_t; - -//float -struct float32x4x2_t { - float32x4_t val[2]; -}; -struct float16x8x2_t { - float16x8_t val[2]; -}; -struct float32x2x2_t { - float32x2_t val[2]; -}; - -typedef struct float32x4x2_t float32x4x2_t; //for C compilers to make them happy -typedef struct float16x8x2_t float16x8x2_t; //for C compilers to make them happy -typedef struct float32x2x2_t float32x2x2_t; //for C compilers to make them happy -typedef float16x8x2_t float16x4x2_t; - -//4 -struct int8x16x4_t { - int8x16_t val[4]; -}; -struct int16x8x4_t { - int16x8_t val[4]; 
-}; -struct int32x4x4_t { - int32x4_t val[4]; -}; -struct int64x2x4_t { - int64x2_t val[4]; -}; - -struct int8x8x4_t { - int8x8_t val[4]; -}; -struct int16x4x4_t { - int16x4_t val[4]; -}; -struct int32x2x4_t { - int32x2_t val[4]; -}; -struct int64x1x4_t { - int64x1_t val[4]; -}; - -typedef struct int8x16x4_t int8x16x4_t; //for C compilers to make them happy -typedef struct int16x8x4_t int16x8x4_t; //for C compilers to make them happy -typedef struct int32x4x4_t int32x4x4_t; //for C compilers to make them happy -typedef struct int64x2x4_t int64x2x4_t; //for C compilers to make them happy - -typedef struct int8x8x4_t int8x8x4_t; //for C compilers to make them happy -typedef struct int16x4x4_t int16x4x4_t; //for C compilers to make them happy -typedef struct int32x2x4_t int32x2x4_t; //for C compilers to make them happy -typedef struct int64x1x4_t int64x1x4_t; //for C compilers to make them happy - -/* to avoid pointer conversions the following unsigned integers structures are defined via the corresponding signed integers dealing structures above:*/ -typedef struct int8x8x4_t uint8x8x4_t; -typedef struct int16x4x4_t uint16x4x4_t; -typedef struct int32x2x4_t uint32x2x4_t; -typedef struct int64x1x4_t uint64x1x4_t; -typedef struct int8x8x4_t poly8x8x4_t; -typedef struct int16x4x4_t poly16x4x4_t; - -typedef struct int8x16x4_t uint8x16x4_t; -typedef struct int16x8x4_t uint16x8x4_t; -typedef struct int32x4x4_t uint32x4x4_t; -typedef struct int64x2x4_t uint64x2x4_t; -typedef struct int8x16x4_t poly8x16x4_t; -typedef struct int16x8x4_t poly16x8x4_t; - -struct float32x4x4_t { - float32x4_t val[4]; -}; -struct float16x8x4_t { - float16x8_t val[4]; -}; -struct float32x2x4_t { - float32x2_t val[4]; -}; - -typedef struct float32x4x4_t float32x4x4_t; //for C compilers to make them happy -typedef struct float16x8x4_t float16x8x4_t; //for C compilers to make them happy -typedef struct float32x2x4_t float32x2x4_t; //for C compilers to make them happy -typedef float16x8x4_t 
float16x4x4_t; - -//3 -struct int16x8x3_t { - int16x8_t val[3]; -}; -struct int32x4x3_t { - int32x4_t val[3]; -}; -struct int64x2x3_t { - int64x2_t val[3]; -}; -struct int8x16x3_t { - int8x16_t val[3]; -}; - -struct int16x4x3_t { - int16x4_t val[3]; -}; -struct int32x2x3_t { - int32x2_t val[3]; -}; -struct int64x1x3_t { - int64x1_t val[3]; -}; -struct int8x8x3_t { - int8x8_t val[3]; -}; -typedef struct int16x8x3_t int16x8x3_t; //for C compilers to make them happy -typedef struct int32x4x3_t int32x4x3_t; //for C compilers to make them happy -typedef struct int64x2x3_t int64x2x3_t; //for C compilers to make them happy -typedef struct int8x16x3_t int8x16x3_t; //for C compilers to make them happy - -typedef struct int8x8x3_t int8x8x3_t; //for C compilers to make them happy -typedef struct int16x4x3_t int16x4x3_t; //for C compilers to make them happy -typedef struct int32x2x3_t int32x2x3_t; //for C compilers to make them happy -typedef struct int64x1x3_t int64x1x3_t; //for C compilers to make them happy - - -/* to avoid pointer conversions the following unsigned integers structures are defined via the corresponding signed integers dealing structures above:*/ -typedef struct int8x16x3_t uint8x16x3_t; -typedef struct int16x8x3_t uint16x8x3_t; -typedef struct int32x4x3_t uint32x4x3_t; -typedef struct int64x2x3_t uint64x2x3_t; -typedef struct int8x16x3_t poly8x16x3_t; -typedef struct int16x8x3_t poly16x8x3_t; -typedef struct int8x8x3_t uint8x8x3_t; -typedef struct int16x4x3_t uint16x4x3_t; -typedef struct int32x2x3_t uint32x2x3_t; -typedef struct int64x1x3_t uint64x1x3_t; -typedef struct int8x8x3_t poly8x8x3_t; -typedef struct int16x4x3_t poly16x4x3_t; - -//float -struct float32x4x3_t { - float32x4_t val[3]; -}; -struct float32x2x3_t { - float32x2_t val[3]; -}; -struct float16x8x3_t { - float16x8_t val[3]; -}; - -typedef struct float32x4x3_t float32x4x3_t; //for C compilers to make them happy -typedef struct float16x8x3_t float16x8x3_t; //for C compilers to make them happy 
-typedef struct float32x2x3_t float32x2x3_t; //for C compilers to make them happy -typedef float16x8x3_t float16x4x3_t; - - -//**************************************************************************** -//****** Porting auxiliary macros ******************************************** - -//** floating point related macros ** -#define _M128i(a) _mm_castps_si128(a) -#define _M128(a) _mm_castsi128_ps(a) -//here the most performance effective implementation is compiler and 32/64 bits build dependent -#if defined (_NEON2SSE_64BIT) || (defined (__INTEL_COMPILER) && (__INTEL_COMPILER >= 1500) ) - - #define _pM128i(a) _mm_cvtsi64_si128(*(int64_t*)(&(a))) - #define _M64(out, inp) out.m64_i64[0] = _mm_cvtsi128_si64 (inp); - #define _M64f(out, inp) out.m64_i64[0] = _mm_cvtsi128_si64 (_M128i(inp)); -#else - //for 32bit gcc and Microsoft compilers builds - #define _pM128i(a) _mm_loadl_epi64((__m128i*)&(a)) - #define _M64(out, inp) _mm_storel_epi64 ((__m128i*)&(out), inp) - #define _M64f(out, inp) _mm_storel_epi64 ((__m128i*)&(out), _M128i(inp)) -#endif -#define _pM128(a) _mm_castsi128_ps(_pM128i(a)) - -#define return64(a) _M64(res64,a); return res64; -#define return64f(a) _M64f(res64,a); return res64; - -#define _Ui64(a) (*(uint64_t*)&(a)) -#define _UNSIGNED_T(a) u ## a - -#define _SIGNBIT64 ((uint64_t)1 << 63) -#define _SWAP_HI_LOW32 (2 | (3 << 2) | (0 << 4) | (1 << 6)) -#define _INSERTPS_NDX(srcField, dstField) (((srcField) << 6) | ((dstField) << 4) ) - -#define _NEON2SSE_REASON_SLOW_SERIAL "The function may be very slow due to the serial implementation, please try to avoid it" -#define _NEON2SSE_REASON_SLOW_UNEFFECTIVE "The function may be slow due to inefficient x86 SIMD implementation, please try to avoid it" - -//~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -#define __constrange(min,max) const -#define __transfersize(size) -//~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - - -//************************************************************************* 
-//************************************************************************* -//********* Functions declarations as declared in original arm_neon.h ***** -//************************************************************************* -//Vector add: vadd -> Vr[i]:=Va[i]+Vb[i], Vr, Va, Vb have equal lane sizes. -int8x8_t vadd_s8(int8x8_t a, int8x8_t b); // VADD.I8 d0,d0,d0 -int16x4_t vadd_s16(int16x4_t a, int16x4_t b); // VADD.I16 d0,d0,d0 -int32x2_t vadd_s32(int32x2_t a, int32x2_t b); // VADD.I32 d0,d0,d0 -int64x1_t vadd_s64(int64x1_t a, int64x1_t b); // VADD.I64 d0,d0,d0 -float32x2_t vadd_f32(float32x2_t a, float32x2_t b); // VADD.F32 d0,d0,d0 -uint8x8_t vadd_u8(uint8x8_t a, uint8x8_t b); // VADD.I8 d0,d0,d0 -uint16x4_t vadd_u16(uint16x4_t a, uint16x4_t b); // VADD.I16 d0,d0,d0 -uint32x2_t vadd_u32(uint32x2_t a, uint32x2_t b); // VADD.I32 d0,d0,d0 -uint64x1_t vadd_u64(uint64x1_t a, uint64x1_t b); // VADD.I64 d0,d0,d0 -int8x16_t vaddq_s8(int8x16_t a, int8x16_t b); // VADD.I8 q0,q0,q0 -int16x8_t vaddq_s16(int16x8_t a, int16x8_t b); // VADD.I16 q0,q0,q0 -int32x4_t vaddq_s32(int32x4_t a, int32x4_t b); // VADD.I32 q0,q0,q0 -int64x2_t vaddq_s64(int64x2_t a, int64x2_t b); // VADD.I64 q0,q0,q0 -float32x4_t vaddq_f32(float32x4_t a, float32x4_t b); // VADD.F32 q0,q0,q0 -uint8x16_t vaddq_u8(uint8x16_t a, uint8x16_t b); // VADD.I8 q0,q0,q0 -uint16x8_t vaddq_u16(uint16x8_t a, uint16x8_t b); // VADD.I16 q0,q0,q0 -uint32x4_t vaddq_u32(uint32x4_t a, uint32x4_t b); // VADD.I32 q0,q0,q0 -uint64x2_t vaddq_u64(uint64x2_t a, uint64x2_t b); // VADD.I64 q0,q0,q0 -//Vector long add: vaddl -> Vr[i]:=Va[i]+Vb[i], Va, Vb have equal lane sizes, result is a 128 bit vector of lanes that are twice the width. 
-int16x8_t vaddl_s8(int8x8_t a, int8x8_t b); // VADDL.S8 q0,d0,d0
-int32x4_t vaddl_s16(int16x4_t a, int16x4_t b); // VADDL.S16 q0,d0,d0
-int64x2_t vaddl_s32(int32x2_t a, int32x2_t b); // VADDL.S32 q0,d0,d0
-uint16x8_t vaddl_u8(uint8x8_t a, uint8x8_t b); // VADDL.U8 q0,d0,d0
-uint32x4_t vaddl_u16(uint16x4_t a, uint16x4_t b); // VADDL.U16 q0,d0,d0
-uint64x2_t vaddl_u32(uint32x2_t a, uint32x2_t b); // VADDL.U32 q0,d0,d0
-//Vector wide add: vaddw -> Vr[i]:=Va[i]+Vb[i]
-int16x8_t vaddw_s8(int16x8_t a, int8x8_t b); // VADDW.S8 q0,q0,d0
-int32x4_t vaddw_s16(int32x4_t a, int16x4_t b); // VADDW.S16 q0,q0,d0
-int64x2_t vaddw_s32(int64x2_t a, int32x2_t b); // VADDW.S32 q0,q0,d0
-uint16x8_t vaddw_u8(uint16x8_t a, uint8x8_t b); // VADDW.U8 q0,q0,d0
-uint32x4_t vaddw_u16(uint32x4_t a, uint16x4_t b); // VADDW.U16 q0,q0,d0
-uint64x2_t vaddw_u32(uint64x2_t a, uint32x2_t b); // VADDW.U32 q0,q0,d0
-//Vector halving add: vhadd -> Vr[i]:=(Va[i]+Vb[i])>>1
-int8x8_t vhadd_s8(int8x8_t a, int8x8_t b); // VHADD.S8 d0,d0,d0
-int16x4_t vhadd_s16(int16x4_t a, int16x4_t b); // VHADD.S16 d0,d0,d0
-int32x2_t vhadd_s32(int32x2_t a, int32x2_t b); // VHADD.S32 d0,d0,d0
-uint8x8_t vhadd_u8(uint8x8_t a, uint8x8_t b); // VHADD.U8 d0,d0,d0
-uint16x4_t vhadd_u16(uint16x4_t a, uint16x4_t b); // VHADD.U16 d0,d0,d0
-uint32x2_t vhadd_u32(uint32x2_t a, uint32x2_t b); // VHADD.U32 d0,d0,d0
-int8x16_t vhaddq_s8(int8x16_t a, int8x16_t b); // VHADD.S8 q0,q0,q0
-int16x8_t vhaddq_s16(int16x8_t a, int16x8_t b); // VHADD.S16 q0,q0,q0
-int32x4_t vhaddq_s32(int32x4_t a, int32x4_t b); // VHADD.S32 q0,q0,q0
-uint8x16_t vhaddq_u8(uint8x16_t a, uint8x16_t b); // VHADD.U8 q0,q0,q0
-uint16x8_t vhaddq_u16(uint16x8_t a, uint16x8_t b); // VHADD.U16 q0,q0,q0
-uint32x4_t vhaddq_u32(uint32x4_t a, uint32x4_t b); // VHADD.U32 q0,q0,q0
-//Vector rounding halving add: vrhadd -> Vr[i]:=(Va[i]+Vb[i]+1)>>1
-int8x8_t vrhadd_s8(int8x8_t a, int8x8_t b); // VRHADD.S8 d0,d0,d0
-int16x4_t vrhadd_s16(int16x4_t a, int16x4_t b); // VRHADD.S16
d0,d0,d0
-int32x2_t vrhadd_s32(int32x2_t a, int32x2_t b); // VRHADD.S32 d0,d0,d0
-uint8x8_t vrhadd_u8(uint8x8_t a, uint8x8_t b); // VRHADD.U8 d0,d0,d0
-uint16x4_t vrhadd_u16(uint16x4_t a, uint16x4_t b); // VRHADD.U16 d0,d0,d0
-uint32x2_t vrhadd_u32(uint32x2_t a, uint32x2_t b); // VRHADD.U32 d0,d0,d0
-int8x16_t vrhaddq_s8(int8x16_t a, int8x16_t b); // VRHADD.S8 q0,q0,q0
-int16x8_t vrhaddq_s16(int16x8_t a, int16x8_t b); // VRHADD.S16 q0,q0,q0
-int32x4_t vrhaddq_s32(int32x4_t a, int32x4_t b); // VRHADD.S32 q0,q0,q0
-uint8x16_t vrhaddq_u8(uint8x16_t a, uint8x16_t b); // VRHADD.U8 q0,q0,q0
-uint16x8_t vrhaddq_u16(uint16x8_t a, uint16x8_t b); // VRHADD.U16 q0,q0,q0
-uint32x4_t vrhaddq_u32(uint32x4_t a, uint32x4_t b); // VRHADD.U32 q0,q0,q0
-//Vector saturating add: vqadd -> Vr[i]:=sat<size>(Va[i]+Vb[i])
-int8x8_t vqadd_s8(int8x8_t a, int8x8_t b); // VQADD.S8 d0,d0,d0
-int16x4_t vqadd_s16(int16x4_t a, int16x4_t b); // VQADD.S16 d0,d0,d0
-int32x2_t vqadd_s32(int32x2_t a, int32x2_t b); // VQADD.S32 d0,d0,d0
-int64x1_t vqadd_s64(int64x1_t a, int64x1_t b); // VQADD.S64 d0,d0,d0
-uint8x8_t vqadd_u8(uint8x8_t a, uint8x8_t b); // VQADD.U8 d0,d0,d0
-uint16x4_t vqadd_u16(uint16x4_t a, uint16x4_t b); // VQADD.U16 d0,d0,d0
-uint32x2_t vqadd_u32(uint32x2_t a, uint32x2_t b); // VQADD.U32 d0,d0,d0
-uint64x1_t vqadd_u64(uint64x1_t a, uint64x1_t b); // VQADD.U64 d0,d0,d0
-int8x16_t vqaddq_s8(int8x16_t a, int8x16_t b); // VQADD.S8 q0,q0,q0
-int16x8_t vqaddq_s16(int16x8_t a, int16x8_t b); // VQADD.S16 q0,q0,q0
-int32x4_t vqaddq_s32(int32x4_t a, int32x4_t b); // VQADD.S32 q0,q0,q0
-int64x2_t vqaddq_s64(int64x2_t a, int64x2_t b); // VQADD.S64 q0,q0,q0
-uint8x16_t vqaddq_u8(uint8x16_t a, uint8x16_t b); // VQADD.U8 q0,q0,q0
-uint16x8_t vqaddq_u16(uint16x8_t a, uint16x8_t b); // VQADD.U16 q0,q0,q0
-uint32x4_t vqaddq_u32(uint32x4_t a, uint32x4_t b); // VQADD.U32 q0,q0,q0
-uint64x2_t vqaddq_u64(uint64x2_t a, uint64x2_t b); // VQADD.U64 q0,q0,q0
-//Vector add high half: vaddhn -> Vr[i]:=high half of (Va[i]+Vb[i]), narrowing each lane to half its width
-int8x8_t vaddhn_s16(int16x8_t a, int16x8_t b); // VADDHN.I16 d0,q0,q0 -int16x4_t vaddhn_s32(int32x4_t a, int32x4_t b); // VADDHN.I32 d0,q0,q0 -int32x2_t vaddhn_s64(int64x2_t a, int64x2_t b); // VADDHN.I64 d0,q0,q0 -uint8x8_t vaddhn_u16(uint16x8_t a, uint16x8_t b); // VADDHN.I16 d0,q0,q0 -uint16x4_t vaddhn_u32(uint32x4_t a, uint32x4_t b); // VADDHN.I32 d0,q0,q0 -uint32x2_t vaddhn_u64(uint64x2_t a, uint64x2_t b); // VADDHN.I64 d0,q0,q0 -//Vector rounding add high half: vraddhn -int8x8_t vraddhn_s16(int16x8_t a, int16x8_t b); // VRADDHN.I16 d0,q0,q0 -int16x4_t vraddhn_s32(int32x4_t a, int32x4_t b); // VRADDHN.I32 d0,q0,q0 -int32x2_t vraddhn_s64(int64x2_t a, int64x2_t b); // VRADDHN.I64 d0,q0,q0 -uint8x8_t vraddhn_u16(uint16x8_t a, uint16x8_t b); // VRADDHN.I16 d0,q0,q0 -uint16x4_t vraddhn_u32(uint32x4_t a, uint32x4_t b); // VRADDHN.I32 d0,q0,q0 -uint32x2_t vraddhn_u64(uint64x2_t a, uint64x2_t b); // VRADDHN.I64 d0,q0,q0 -//Multiplication -//Vector multiply: vmul -> Vr[i] := Va[i] * Vb[i] -int8x8_t vmul_s8(int8x8_t a, int8x8_t b); // VMUL.I8 d0,d0,d0 -int16x4_t vmul_s16(int16x4_t a, int16x4_t b); // VMUL.I16 d0,d0,d0 -int32x2_t vmul_s32(int32x2_t a, int32x2_t b); // VMUL.I32 d0,d0,d0 -float32x2_t vmul_f32(float32x2_t a, float32x2_t b); // VMUL.F32 d0,d0,d0 -uint8x8_t vmul_u8(uint8x8_t a, uint8x8_t b); // VMUL.I8 d0,d0,d0 -uint16x4_t vmul_u16(uint16x4_t a, uint16x4_t b); // VMUL.I16 d0,d0,d0 -uint32x2_t vmul_u32(uint32x2_t a, uint32x2_t b); // VMUL.I32 d0,d0,d0 -poly8x8_t vmul_p8(poly8x8_t a, poly8x8_t b); // VMUL.P8 d0,d0,d0 -int8x16_t vmulq_s8(int8x16_t a, int8x16_t b); // VMUL.I8 q0,q0,q0 -int16x8_t vmulq_s16(int16x8_t a, int16x8_t b); // VMUL.I16 q0,q0,q0 -int32x4_t vmulq_s32(int32x4_t a, int32x4_t b); // VMUL.I32 q0,q0,q0 -float32x4_t vmulq_f32(float32x4_t a, float32x4_t b); // VMUL.F32 q0,q0,q0 -uint8x16_t vmulq_u8(uint8x16_t a, uint8x16_t b); // VMUL.I8 q0,q0,q0 -uint16x8_t vmulq_u16(uint16x8_t a, uint16x8_t b); // VMUL.I16 q0,q0,q0 -uint32x4_t 
vmulq_u32(uint32x4_t a, uint32x4_t b); // VMUL.I32 q0,q0,q0 -poly8x16_t vmulq_p8(poly8x16_t a, poly8x16_t b); // VMUL.P8 q0,q0,q0 -//multiply lane -int16x4_t vmul_lane_s16 (int16x4_t a, int16x4_t b, __constrange(0,3) int c); -int32x2_t vmul_lane_s32 (int32x2_t a, int32x2_t b, __constrange(0,1) int c); -float32x2_t vmul_lane_f32 (float32x2_t a, float32x2_t b, __constrange(0,1) int c); -uint16x4_t vmul_lane_u16 (uint16x4_t a, uint16x4_t b, __constrange(0,3) int c); -uint32x2_t vmul_lane_u32 (uint32x2_t a, uint32x2_t b, __constrange(0,1) int c); -int16x8_t vmulq_lane_s16 (int16x8_t a, int16x4_t b, __constrange(0,3) int c); -int32x4_t vmulq_lane_s32 (int32x4_t a, int32x2_t b, __constrange(0,1) int c); -float32x4_t vmulq_lane_f32 (float32x4_t a, float32x2_t b, __constrange(0,1) int c); -uint16x8_t vmulq_lane_u16 (uint16x8_t a, uint16x4_t b, __constrange(0,3) int c); -uint32x4_t vmulq_lane_u32 (uint32x4_t a, uint32x2_t b, __constrange(0,1) int c); -//Vector multiply accumulate: vmla -> Vr[i] := Va[i] + Vb[i] * Vc[i] -int8x8_t vmla_s8(int8x8_t a, int8x8_t b, int8x8_t c); // VMLA.I8 d0,d0,d0 -int16x4_t vmla_s16(int16x4_t a, int16x4_t b, int16x4_t c); // VMLA.I16 d0,d0,d0 -int32x2_t vmla_s32(int32x2_t a, int32x2_t b, int32x2_t c); // VMLA.I32 d0,d0,d0 -float32x2_t vmla_f32(float32x2_t a, float32x2_t b, float32x2_t c); // VMLA.F32 d0,d0,d0 -uint8x8_t vmla_u8(uint8x8_t a, uint8x8_t b, uint8x8_t c); // VMLA.I8 d0,d0,d0 -uint16x4_t vmla_u16(uint16x4_t a, uint16x4_t b, uint16x4_t c); // VMLA.I16 d0,d0,d0 -uint32x2_t vmla_u32(uint32x2_t a, uint32x2_t b, uint32x2_t c); // VMLA.I32 d0,d0,d0 -int8x16_t vmlaq_s8(int8x16_t a, int8x16_t b, int8x16_t c); // VMLA.I8 q0,q0,q0 -int16x8_t vmlaq_s16(int16x8_t a, int16x8_t b, int16x8_t c); // VMLA.I16 q0,q0,q0 -int32x4_t vmlaq_s32(int32x4_t a, int32x4_t b, int32x4_t c); // VMLA.I32 q0,q0,q0 -float32x4_t vmlaq_f32(float32x4_t a, float32x4_t b, float32x4_t c); // VMLA.F32 q0,q0,q0 -uint8x16_t vmlaq_u8(uint8x16_t a, uint8x16_t b, uint8x16_t c); 
// VMLA.I8 q0,q0,q0 -uint16x8_t vmlaq_u16(uint16x8_t a, uint16x8_t b, uint16x8_t c); // VMLA.I16 q0,q0,q0 -uint32x4_t vmlaq_u32(uint32x4_t a, uint32x4_t b, uint32x4_t c); // VMLA.I32 q0,q0,q0 -//Vector multiply accumulate long: vmlal -> Vr[i] := Va[i] + Vb[i] * Vc[i] -int16x8_t vmlal_s8(int16x8_t a, int8x8_t b, int8x8_t c); // VMLAL.S8 q0,d0,d0 -int32x4_t vmlal_s16(int32x4_t a, int16x4_t b, int16x4_t c); // VMLAL.S16 q0,d0,d0 -int64x2_t vmlal_s32(int64x2_t a, int32x2_t b, int32x2_t c); // VMLAL.S32 q0,d0,d0 -uint16x8_t vmlal_u8(uint16x8_t a, uint8x8_t b, uint8x8_t c); // VMLAL.U8 q0,d0,d0 -uint32x4_t vmlal_u16(uint32x4_t a, uint16x4_t b, uint16x4_t c); // VMLAL.U16 q0,d0,d0 -uint64x2_t vmlal_u32(uint64x2_t a, uint32x2_t b, uint32x2_t c); // VMLAL.U32 q0,d0,d0 -//Vector multiply subtract: vmls -> Vr[i] := Va[i] - Vb[i] * Vc[i] -int8x8_t vmls_s8(int8x8_t a, int8x8_t b, int8x8_t c); // VMLS.I8 d0,d0,d0 -int16x4_t vmls_s16(int16x4_t a, int16x4_t b, int16x4_t c); // VMLS.I16 d0,d0,d0 -int32x2_t vmls_s32(int32x2_t a, int32x2_t b, int32x2_t c); // VMLS.I32 d0,d0,d0 -float32x2_t vmls_f32(float32x2_t a, float32x2_t b, float32x2_t c); // VMLS.F32 d0,d0,d0 -uint8x8_t vmls_u8(uint8x8_t a, uint8x8_t b, uint8x8_t c); // VMLS.I8 d0,d0,d0 -uint16x4_t vmls_u16(uint16x4_t a, uint16x4_t b, uint16x4_t c); // VMLS.I16 d0,d0,d0 -uint32x2_t vmls_u32(uint32x2_t a, uint32x2_t b, uint32x2_t c); // VMLS.I32 d0,d0,d0 -int8x16_t vmlsq_s8(int8x16_t a, int8x16_t b, int8x16_t c); // VMLS.I8 q0,q0,q0 -int16x8_t vmlsq_s16(int16x8_t a, int16x8_t b, int16x8_t c); // VMLS.I16 q0,q0,q0 -int32x4_t vmlsq_s32(int32x4_t a, int32x4_t b, int32x4_t c); // VMLS.I32 q0,q0,q0 -float32x4_t vmlsq_f32(float32x4_t a, float32x4_t b, float32x4_t c); // VMLS.F32 q0,q0,q0 -uint8x16_t vmlsq_u8(uint8x16_t a, uint8x16_t b, uint8x16_t c); // VMLS.I8 q0,q0,q0 -uint16x8_t vmlsq_u16(uint16x8_t a, uint16x8_t b, uint16x8_t c); // VMLS.I16 q0,q0,q0 -uint32x4_t vmlsq_u32(uint32x4_t a, uint32x4_t b, uint32x4_t c); // VMLS.I32 
q0,q0,q0 -//Vector multiply subtract long -int16x8_t vmlsl_s8(int16x8_t a, int8x8_t b, int8x8_t c); // VMLSL.S8 q0,d0,d0 -int32x4_t vmlsl_s16(int32x4_t a, int16x4_t b, int16x4_t c); // VMLSL.S16 q0,d0,d0 -int64x2_t vmlsl_s32(int64x2_t a, int32x2_t b, int32x2_t c); // VMLSL.S32 q0,d0,d0 -uint16x8_t vmlsl_u8(uint16x8_t a, uint8x8_t b, uint8x8_t c); // VMLSL.U8 q0,d0,d0 -uint32x4_t vmlsl_u16(uint32x4_t a, uint16x4_t b, uint16x4_t c); // VMLSL.U16 q0,d0,d0 -uint64x2_t vmlsl_u32(uint64x2_t a, uint32x2_t b, uint32x2_t c); // VMLSL.U32 q0,d0,d0 -//Vector saturating doubling multiply high -int16x4_t vqdmulh_s16(int16x4_t a, int16x4_t b); // VQDMULH.S16 d0,d0,d0 -int32x2_t vqdmulh_s32(int32x2_t a, int32x2_t b); // VQDMULH.S32 d0,d0,d0 -int16x8_t vqdmulhq_s16(int16x8_t a, int16x8_t b); // VQDMULH.S16 q0,q0,q0 -int32x4_t vqdmulhq_s32(int32x4_t a, int32x4_t b); // VQDMULH.S32 q0,q0,q0 -//Vector saturating rounding doubling multiply high -int16x4_t vqrdmulh_s16(int16x4_t a, int16x4_t b); // VQRDMULH.S16 d0,d0,d0 -int32x2_t vqrdmulh_s32(int32x2_t a, int32x2_t b); // VQRDMULH.S32 d0,d0,d0 -int16x8_t vqrdmulhq_s16(int16x8_t a, int16x8_t b); // VQRDMULH.S16 q0,q0,q0 -int32x4_t vqrdmulhq_s32(int32x4_t a, int32x4_t b); // VQRDMULH.S32 q0,q0,q0 -//Vector saturating doubling multiply accumulate long -int32x4_t vqdmlal_s16(int32x4_t a, int16x4_t b, int16x4_t c); // VQDMLAL.S16 q0,d0,d0 -int64x2_t vqdmlal_s32(int64x2_t a, int32x2_t b, int32x2_t c); // VQDMLAL.S32 q0,d0,d0 -//Vector saturating doubling multiply subtract long -int32x4_t vqdmlsl_s16(int32x4_t a, int16x4_t b, int16x4_t c); // VQDMLSL.S16 q0,d0,d0 -int64x2_t vqdmlsl_s32(int64x2_t a, int32x2_t b, int32x2_t c); // VQDMLSL.S32 q0,d0,d0 -//Vector long multiply -int16x8_t vmull_s8(int8x8_t a, int8x8_t b); // VMULL.S8 q0,d0,d0 -int32x4_t vmull_s16(int16x4_t a, int16x4_t b); // VMULL.S16 q0,d0,d0 -int64x2_t vmull_s32(int32x2_t a, int32x2_t b); // VMULL.S32 q0,d0,d0 -uint16x8_t vmull_u8(uint8x8_t a, uint8x8_t b); // VMULL.U8 q0,d0,d0 
-uint32x4_t vmull_u16(uint16x4_t a, uint16x4_t b); // VMULL.U16 q0,d0,d0
-uint64x2_t vmull_u32(uint32x2_t a, uint32x2_t b); // VMULL.U32 q0,d0,d0
-poly16x8_t vmull_p8(poly8x8_t a, poly8x8_t b); // VMULL.P8 q0,d0,d0
-//Vector saturating doubling long multiply
-int32x4_t vqdmull_s16(int16x4_t a, int16x4_t b); // VQDMULL.S16 q0,d0,d0
-int64x2_t vqdmull_s32(int32x2_t a, int32x2_t b); // VQDMULL.S32 q0,d0,d0
-//Subtraction
-//Vector subtract
-int8x8_t vsub_s8(int8x8_t a, int8x8_t b); // VSUB.I8 d0,d0,d0
-int16x4_t vsub_s16(int16x4_t a, int16x4_t b); // VSUB.I16 d0,d0,d0
-int32x2_t vsub_s32(int32x2_t a, int32x2_t b); // VSUB.I32 d0,d0,d0
-int64x1_t vsub_s64(int64x1_t a, int64x1_t b); // VSUB.I64 d0,d0,d0
-float32x2_t vsub_f32(float32x2_t a, float32x2_t b); // VSUB.F32 d0,d0,d0
-uint8x8_t vsub_u8(uint8x8_t a, uint8x8_t b); // VSUB.I8 d0,d0,d0
-uint16x4_t vsub_u16(uint16x4_t a, uint16x4_t b); // VSUB.I16 d0,d0,d0
-uint32x2_t vsub_u32(uint32x2_t a, uint32x2_t b); // VSUB.I32 d0,d0,d0
-uint64x1_t vsub_u64(uint64x1_t a, uint64x1_t b); // VSUB.I64 d0,d0,d0
-int8x16_t vsubq_s8(int8x16_t a, int8x16_t b); // VSUB.I8 q0,q0,q0
-int16x8_t vsubq_s16(int16x8_t a, int16x8_t b); // VSUB.I16 q0,q0,q0
-int32x4_t vsubq_s32(int32x4_t a, int32x4_t b); // VSUB.I32 q0,q0,q0
-int64x2_t vsubq_s64(int64x2_t a, int64x2_t b); // VSUB.I64 q0,q0,q0
-float32x4_t vsubq_f32(float32x4_t a, float32x4_t b); // VSUB.F32 q0,q0,q0
-uint8x16_t vsubq_u8(uint8x16_t a, uint8x16_t b); // VSUB.I8 q0,q0,q0
-uint16x8_t vsubq_u16(uint16x8_t a, uint16x8_t b); // VSUB.I16 q0,q0,q0
-uint32x4_t vsubq_u32(uint32x4_t a, uint32x4_t b); // VSUB.I32 q0,q0,q0
-uint64x2_t vsubq_u64(uint64x2_t a, uint64x2_t b); // VSUB.I64 q0,q0,q0
-//Vector long subtract: vsubl -> Vr[i]:=Va[i]-Vb[i], result is a 128 bit vector of lanes that are twice the width
-int16x8_t vsubl_s8(int8x8_t a, int8x8_t b); // VSUBL.S8 q0,d0,d0
-int32x4_t vsubl_s16(int16x4_t a, int16x4_t b); // VSUBL.S16 q0,d0,d0
-int64x2_t vsubl_s32(int32x2_t a, int32x2_t b); // VSUBL.S32 q0,d0,d0
-uint16x8_t vsubl_u8(uint8x8_t a, uint8x8_t b); // VSUBL.U8 q0,d0,d0
-uint32x4_t vsubl_u16(uint16x4_t a, uint16x4_t b); // VSUBL.U16 q0,d0,d0
-uint64x2_t vsubl_u32(uint32x2_t a, uint32x2_t b); // VSUBL.U32 q0,d0,d0
-//Vector wide subtract: vsubw -> Vr[i]:=Va[i]-Vb[i]
-int16x8_t vsubw_s8(int16x8_t a, int8x8_t b); // VSUBW.S8 q0,q0,d0
-int32x4_t vsubw_s16(int32x4_t a, int16x4_t b); // VSUBW.S16 q0,q0,d0
-int64x2_t vsubw_s32(int64x2_t a, int32x2_t b); // VSUBW.S32 q0,q0,d0
-uint16x8_t vsubw_u8(uint16x8_t a, uint8x8_t b); // VSUBW.U8 q0,q0,d0
-uint32x4_t vsubw_u16(uint32x4_t a, uint16x4_t b); // VSUBW.U16 q0,q0,d0
-uint64x2_t vsubw_u32(uint64x2_t a, uint32x2_t b); // VSUBW.U32 q0,q0,d0
-//Vector saturating subtract
-int8x8_t vqsub_s8(int8x8_t a, int8x8_t b); // VQSUB.S8 d0,d0,d0
-int16x4_t vqsub_s16(int16x4_t a, int16x4_t b); // VQSUB.S16 d0,d0,d0
-int32x2_t vqsub_s32(int32x2_t a, int32x2_t b); // VQSUB.S32 d0,d0,d0
-int64x1_t vqsub_s64(int64x1_t a, int64x1_t b); // VQSUB.S64 d0,d0,d0
-uint8x8_t vqsub_u8(uint8x8_t a, uint8x8_t b); // VQSUB.U8 d0,d0,d0
-uint16x4_t vqsub_u16(uint16x4_t a, uint16x4_t b); // VQSUB.U16 d0,d0,d0
-uint32x2_t vqsub_u32(uint32x2_t a, uint32x2_t b); // VQSUB.U32 d0,d0,d0
-uint64x1_t vqsub_u64(uint64x1_t a, uint64x1_t b); // VQSUB.U64 d0,d0,d0
-int8x16_t vqsubq_s8(int8x16_t a, int8x16_t b); // VQSUB.S8 q0,q0,q0
-int16x8_t vqsubq_s16(int16x8_t a, int16x8_t b); // VQSUB.S16 q0,q0,q0
-int32x4_t vqsubq_s32(int32x4_t a, int32x4_t b); // VQSUB.S32 q0,q0,q0
-int64x2_t vqsubq_s64(int64x2_t a, int64x2_t b); // VQSUB.S64 q0,q0,q0
-uint8x16_t vqsubq_u8(uint8x16_t a, uint8x16_t b); // VQSUB.U8 q0,q0,q0
-uint16x8_t vqsubq_u16(uint16x8_t a, uint16x8_t b); // VQSUB.U16 q0,q0,q0
-uint32x4_t vqsubq_u32(uint32x4_t a, uint32x4_t b); // VQSUB.U32 q0,q0,q0
-uint64x2_t vqsubq_u64(uint64x2_t a, uint64x2_t b); // VQSUB.U64 q0,q0,q0
-//Vector halving subtract
-int8x8_t vhsub_s8(int8x8_t a, int8x8_t b); // VHSUB.S8 d0,d0,d0
-int16x4_t vhsub_s16(int16x4_t a, int16x4_t b); // VHSUB.S16 d0,d0,d0
-int32x2_t
vhsub_s32(int32x2_t a, int32x2_t b); // VHSUB.S32 d0,d0,d0 -uint8x8_t vhsub_u8(uint8x8_t a, uint8x8_t b); // VHSUB.U8 d0,d0,d0 -uint16x4_t vhsub_u16(uint16x4_t a, uint16x4_t b); // VHSUB.U16 d0,d0,d0 -uint32x2_t vhsub_u32(uint32x2_t a, uint32x2_t b); // VHSUB.U32 d0,d0,d0 -int8x16_t vhsubq_s8(int8x16_t a, int8x16_t b); // VHSUB.S8 q0,q0,q0 -int16x8_t vhsubq_s16(int16x8_t a, int16x8_t b); // VHSUB.S16 q0,q0,q0 -int32x4_t vhsubq_s32(int32x4_t a, int32x4_t b); // VHSUB.S32 q0,q0,q0 -uint8x16_t vhsubq_u8(uint8x16_t a, uint8x16_t b); // VHSUB.U8 q0,q0,q0 -uint16x8_t vhsubq_u16(uint16x8_t a, uint16x8_t b); // VHSUB.U16 q0,q0,q0 -uint32x4_t vhsubq_u32(uint32x4_t a, uint32x4_t b); // VHSUB.U32 q0,q0,q0 -//Vector subtract high half -int8x8_t vsubhn_s16(int16x8_t a, int16x8_t b); // VSUBHN.I16 d0,q0,q0 -int16x4_t vsubhn_s32(int32x4_t a, int32x4_t b); // VSUBHN.I32 d0,q0,q0 -int32x2_t vsubhn_s64(int64x2_t a, int64x2_t b); // VSUBHN.I64 d0,q0,q0 -uint8x8_t vsubhn_u16(uint16x8_t a, uint16x8_t b); // VSUBHN.I16 d0,q0,q0 -uint16x4_t vsubhn_u32(uint32x4_t a, uint32x4_t b); // VSUBHN.I32 d0,q0,q0 -uint32x2_t vsubhn_u64(uint64x2_t a, uint64x2_t b); // VSUBHN.I64 d0,q0,q0 -//Vector rounding subtract high half -int8x8_t vrsubhn_s16(int16x8_t a, int16x8_t b); // VRSUBHN.I16 d0,q0,q0 -int16x4_t vrsubhn_s32(int32x4_t a, int32x4_t b); // VRSUBHN.I32 d0,q0,q0 -int32x2_t vrsubhn_s64(int64x2_t a, int64x2_t b); // VRSUBHN.I64 d0,q0,q0 -uint8x8_t vrsubhn_u16(uint16x8_t a, uint16x8_t b); // VRSUBHN.I16 d0,q0,q0 -uint16x4_t vrsubhn_u32(uint32x4_t a, uint32x4_t b); // VRSUBHN.I32 d0,q0,q0 -uint32x2_t vrsubhn_u64(uint64x2_t a, uint64x2_t b); // VRSUBHN.I64 d0,q0,q0 -//Comparison -//Vector compare equal -uint8x8_t vceq_s8(int8x8_t a, int8x8_t b); // VCEQ.I8 d0, d0, d0 -uint16x4_t vceq_s16(int16x4_t a, int16x4_t b); // VCEQ.I16 d0, d0, d0 -uint32x2_t vceq_s32(int32x2_t a, int32x2_t b); // VCEQ.I32 d0, d0, d0 -uint32x2_t vceq_f32(float32x2_t a, float32x2_t b); // VCEQ.F32 d0, d0, d0 -uint8x8_t 
vceq_u8(uint8x8_t a, uint8x8_t b); // VCEQ.I8 d0, d0, d0 -uint16x4_t vceq_u16(uint16x4_t a, uint16x4_t b); // VCEQ.I16 d0, d0, d0 -uint32x2_t vceq_u32(uint32x2_t a, uint32x2_t b); // VCEQ.I32 d0, d0, d0 -uint8x8_t vceq_p8(poly8x8_t a, poly8x8_t b); // VCEQ.I8 d0, d0, d0 -uint8x16_t vceqq_s8(int8x16_t a, int8x16_t b); // VCEQ.I8 q0, q0, q0 -uint16x8_t vceqq_s16(int16x8_t a, int16x8_t b); // VCEQ.I16 q0, q0, q0 -uint32x4_t vceqq_s32(int32x4_t a, int32x4_t b); // VCEQ.I32 q0, q0, q0 -uint32x4_t vceqq_f32(float32x4_t a, float32x4_t b); // VCEQ.F32 q0, q0, q0 -uint8x16_t vceqq_u8(uint8x16_t a, uint8x16_t b); // VCEQ.I8 q0, q0, q0 -uint16x8_t vceqq_u16(uint16x8_t a, uint16x8_t b); // VCEQ.I16 q0, q0, q0 -uint32x4_t vceqq_u32(uint32x4_t a, uint32x4_t b); // VCEQ.I32 q0, q0, q0 -uint8x16_t vceqq_p8(poly8x16_t a, poly8x16_t b); // VCEQ.I8 q0, q0, q0 -//Vector compare greater-than or equal -uint8x8_t vcge_s8(int8x8_t a, int8x8_t b); // VCGE.S8 d0, d0, d0 -uint16x4_t vcge_s16(int16x4_t a, int16x4_t b); // VCGE.S16 d0, d0, d0 -uint32x2_t vcge_s32(int32x2_t a, int32x2_t b); // VCGE.S32 d0, d0, d0 -uint32x2_t vcge_f32(float32x2_t a, float32x2_t b); // VCGE.F32 d0, d0, d0 -uint8x8_t vcge_u8(uint8x8_t a, uint8x8_t b); // VCGE.U8 d0, d0, d0 -uint16x4_t vcge_u16(uint16x4_t a, uint16x4_t b); // VCGE.U16 d0, d0, d0 -uint32x2_t vcge_u32(uint32x2_t a, uint32x2_t b); // VCGE.U32 d0, d0, d0 -uint8x16_t vcgeq_s8(int8x16_t a, int8x16_t b); // VCGE.S8 q0, q0, q0 -uint16x8_t vcgeq_s16(int16x8_t a, int16x8_t b); // VCGE.S16 q0, q0, q0 -uint32x4_t vcgeq_s32(int32x4_t a, int32x4_t b); // VCGE.S32 q0, q0, q0 -uint32x4_t vcgeq_f32(float32x4_t a, float32x4_t b); // VCGE.F32 q0, q0, q0 -uint8x16_t vcgeq_u8(uint8x16_t a, uint8x16_t b); // VCGE.U8 q0, q0, q0 -uint16x8_t vcgeq_u16(uint16x8_t a, uint16x8_t b); // VCGE.U16 q0, q0, q0 -uint32x4_t vcgeq_u32(uint32x4_t a, uint32x4_t b); // VCGE.U32 q0, q0, q0 -//Vector compare less-than or equal -uint8x8_t vcle_s8(int8x8_t a, int8x8_t b); // VCGE.S8 d0, d0, 
d0 -uint16x4_t vcle_s16(int16x4_t a, int16x4_t b); // VCGE.S16 d0, d0, d0 -uint32x2_t vcle_s32(int32x2_t a, int32x2_t b); // VCGE.S32 d0, d0, d0 -uint32x2_t vcle_f32(float32x2_t a, float32x2_t b); // VCGE.F32 d0, d0, d0 -uint8x8_t vcle_u8(uint8x8_t a, uint8x8_t b); // VCGE.U8 d0, d0, d0 -uint16x4_t vcle_u16(uint16x4_t a, uint16x4_t b); // VCGE.U16 d0, d0, d0 -uint32x2_t vcle_u32(uint32x2_t a, uint32x2_t b); // VCGE.U32 d0, d0, d0 -uint8x16_t vcleq_s8(int8x16_t a, int8x16_t b); // VCGE.S8 q0, q0, q0 -uint16x8_t vcleq_s16(int16x8_t a, int16x8_t b); // VCGE.S16 q0, q0, q0 -uint32x4_t vcleq_s32(int32x4_t a, int32x4_t b); // VCGE.S32 q0, q0, q0 -uint32x4_t vcleq_f32(float32x4_t a, float32x4_t b); // VCGE.F32 q0, q0, q0 -uint8x16_t vcleq_u8(uint8x16_t a, uint8x16_t b); // VCGE.U8 q0, q0, q0 -uint16x8_t vcleq_u16(uint16x8_t a, uint16x8_t b); // VCGE.U16 q0, q0, q0 -uint32x4_t vcleq_u32(uint32x4_t a, uint32x4_t b); // VCGE.U32 q0, q0, q0 -//Vector compare greater-than -uint8x8_t vcgt_s8(int8x8_t a, int8x8_t b); // VCGT.S8 d0, d0, d0 -uint16x4_t vcgt_s16(int16x4_t a, int16x4_t b); // VCGT.S16 d0, d0, d0 -uint32x2_t vcgt_s32(int32x2_t a, int32x2_t b); // VCGT.S32 d0, d0, d0 -uint32x2_t vcgt_f32(float32x2_t a, float32x2_t b); // VCGT.F32 d0, d0, d0 -uint8x8_t vcgt_u8(uint8x8_t a, uint8x8_t b); // VCGT.U8 d0, d0, d0 -uint16x4_t vcgt_u16(uint16x4_t a, uint16x4_t b); // VCGT.U16 d0, d0, d0 -uint32x2_t vcgt_u32(uint32x2_t a, uint32x2_t b); // VCGT.U32 d0, d0, d0 -uint8x16_t vcgtq_s8(int8x16_t a, int8x16_t b); // VCGT.S8 q0, q0, q0 -uint16x8_t vcgtq_s16(int16x8_t a, int16x8_t b); // VCGT.S16 q0, q0, q0 -uint32x4_t vcgtq_s32(int32x4_t a, int32x4_t b); // VCGT.S32 q0, q0, q0 -uint32x4_t vcgtq_f32(float32x4_t a, float32x4_t b); // VCGT.F32 q0, q0, q0 -uint8x16_t vcgtq_u8(uint8x16_t a, uint8x16_t b); // VCGT.U8 q0, q0, q0 -uint16x8_t vcgtq_u16(uint16x8_t a, uint16x8_t b); // VCGT.U16 q0, q0, q0 -uint32x4_t vcgtq_u32(uint32x4_t a, uint32x4_t b); // VCGT.U32 q0, q0, q0 -//Vector compare 
less-than -uint8x8_t vclt_s8(int8x8_t a, int8x8_t b); // VCGT.S8 d0, d0, d0 -uint16x4_t vclt_s16(int16x4_t a, int16x4_t b); // VCGT.S16 d0, d0, d0 -uint32x2_t vclt_s32(int32x2_t a, int32x2_t b); // VCGT.S32 d0, d0, d0 -uint32x2_t vclt_f32(float32x2_t a, float32x2_t b); // VCGT.F32 d0, d0, d0 -uint8x8_t vclt_u8(uint8x8_t a, uint8x8_t b); // VCGT.U8 d0, d0, d0 -uint16x4_t vclt_u16(uint16x4_t a, uint16x4_t b); // VCGT.U16 d0, d0, d0 -uint32x2_t vclt_u32(uint32x2_t a, uint32x2_t b); // VCGT.U32 d0, d0, d0 -uint8x16_t vcltq_s8(int8x16_t a, int8x16_t b); // VCGT.S8 q0, q0, q0 -uint16x8_t vcltq_s16(int16x8_t a, int16x8_t b); // VCGT.S16 q0, q0, q0 -uint32x4_t vcltq_s32(int32x4_t a, int32x4_t b); // VCGT.S32 q0, q0, q0 -uint32x4_t vcltq_f32(float32x4_t a, float32x4_t b); // VCGT.F32 q0, q0, q0 -uint8x16_t vcltq_u8(uint8x16_t a, uint8x16_t b); // VCGT.U8 q0, q0, q0 -uint16x8_t vcltq_u16(uint16x8_t a, uint16x8_t b); // VCGT.U16 q0, q0, q0 -uint32x4_t vcltq_u32(uint32x4_t a, uint32x4_t b); // VCGT.U32 q0, q0, q0 -//Vector compare absolute greater-than or equal -uint32x2_t vcage_f32(float32x2_t a, float32x2_t b); // VACGE.F32 d0, d0, d0 -uint32x4_t vcageq_f32(float32x4_t a, float32x4_t b); // VACGE.F32 q0, q0, q0 -//Vector compare absolute less-than or equal -uint32x2_t vcale_f32(float32x2_t a, float32x2_t b); // VACGE.F32 d0, d0, d0 -uint32x4_t vcaleq_f32(float32x4_t a, float32x4_t b); // VACGE.F32 q0, q0, q0 -//Vector compare absolute greater-than -uint32x2_t vcagt_f32(float32x2_t a, float32x2_t b); // VACGT.F32 d0, d0, d0 -uint32x4_t vcagtq_f32(float32x4_t a, float32x4_t b); // VACGT.F32 q0, q0, q0 -//Vector compare absolute less-than -uint32x2_t vcalt_f32(float32x2_t a, float32x2_t b); // VACGT.F32 d0, d0, d0 -uint32x4_t vcaltq_f32(float32x4_t a, float32x4_t b); // VACGT.F32 q0, q0, q0 -//Vector test bits -uint8x8_t vtst_s8(int8x8_t a, int8x8_t b); // VTST.8 d0, d0, d0 -uint16x4_t vtst_s16(int16x4_t a, int16x4_t b); // VTST.16 d0, d0, d0 -uint32x2_t vtst_s32(int32x2_t a, 
int32x2_t b); // VTST.32 d0, d0, d0 -uint8x8_t vtst_u8(uint8x8_t a, uint8x8_t b); // VTST.8 d0, d0, d0 -uint16x4_t vtst_u16(uint16x4_t a, uint16x4_t b); // VTST.16 d0, d0, d0 -uint32x2_t vtst_u32(uint32x2_t a, uint32x2_t b); // VTST.32 d0, d0, d0 -uint8x8_t vtst_p8(poly8x8_t a, poly8x8_t b); // VTST.8 d0, d0, d0 -uint8x16_t vtstq_s8(int8x16_t a, int8x16_t b); // VTST.8 q0, q0, q0 -uint16x8_t vtstq_s16(int16x8_t a, int16x8_t b); // VTST.16 q0, q0, q0 -uint32x4_t vtstq_s32(int32x4_t a, int32x4_t b); // VTST.32 q0, q0, q0 -uint8x16_t vtstq_u8(uint8x16_t a, uint8x16_t b); // VTST.8 q0, q0, q0 -uint16x8_t vtstq_u16(uint16x8_t a, uint16x8_t b); // VTST.16 q0, q0, q0 -uint32x4_t vtstq_u32(uint32x4_t a, uint32x4_t b); // VTST.32 q0, q0, q0 -uint8x16_t vtstq_p8(poly8x16_t a, poly8x16_t b); // VTST.8 q0, q0, q0 -//Absolute difference -//Absolute difference between the arguments: Vr[i] = | Va[i] - Vb[i] | -int8x8_t vabd_s8(int8x8_t a, int8x8_t b); // VABD.S8 d0,d0,d0 -int16x4_t vabd_s16(int16x4_t a, int16x4_t b); // VABD.S16 d0,d0,d0 -int32x2_t vabd_s32(int32x2_t a, int32x2_t b); // VABD.S32 d0,d0,d0 -uint8x8_t vabd_u8(uint8x8_t a, uint8x8_t b); // VABD.U8 d0,d0,d0 -uint16x4_t vabd_u16(uint16x4_t a, uint16x4_t b); // VABD.U16 d0,d0,d0 -uint32x2_t vabd_u32(uint32x2_t a, uint32x2_t b); // VABD.U32 d0,d0,d0 -float32x2_t vabd_f32(float32x2_t a, float32x2_t b); // VABD.F32 d0,d0,d0 -int8x16_t vabdq_s8(int8x16_t a, int8x16_t b); // VABD.S8 q0,q0,q0 -int16x8_t vabdq_s16(int16x8_t a, int16x8_t b); // VABD.S16 q0,q0,q0 -int32x4_t vabdq_s32(int32x4_t a, int32x4_t b); // VABD.S32 q0,q0,q0 -uint8x16_t vabdq_u8(uint8x16_t a, uint8x16_t b); // VABD.U8 q0,q0,q0 -uint16x8_t vabdq_u16(uint16x8_t a, uint16x8_t b); // VABD.U16 q0,q0,q0 -uint32x4_t vabdq_u32(uint32x4_t a, uint32x4_t b); // VABD.U32 q0,q0,q0 -float32x4_t vabdq_f32(float32x4_t a, float32x4_t b); // VABD.F32 q0,q0,q0 -//Absolute difference - long -int16x8_t vabdl_s8(int8x8_t a, int8x8_t b); // VABDL.S8 q0,d0,d0 -int32x4_t 
vabdl_s16(int16x4_t a, int16x4_t b); // VABDL.S16 q0,d0,d0 -int64x2_t vabdl_s32(int32x2_t a, int32x2_t b); // VABDL.S32 q0,d0,d0 -uint16x8_t vabdl_u8(uint8x8_t a, uint8x8_t b); // VABDL.U8 q0,d0,d0 -uint32x4_t vabdl_u16(uint16x4_t a, uint16x4_t b); // VABDL.U16 q0,d0,d0 -uint64x2_t vabdl_u32(uint32x2_t a, uint32x2_t b); // VABDL.U32 q0,d0,d0 -//Absolute difference and accumulate: Vr[i] = Va[i] + | Vb[i] - Vc[i] | -int8x8_t vaba_s8(int8x8_t a, int8x8_t b, int8x8_t c); // VABA.S8 d0,d0,d0 -int16x4_t vaba_s16(int16x4_t a, int16x4_t b, int16x4_t c); // VABA.S16 d0,d0,d0 -int32x2_t vaba_s32(int32x2_t a, int32x2_t b, int32x2_t c); // VABA.S32 d0,d0,d0 -uint8x8_t vaba_u8(uint8x8_t a, uint8x8_t b, uint8x8_t c); // VABA.U8 d0,d0,d0 -uint16x4_t vaba_u16(uint16x4_t a, uint16x4_t b, uint16x4_t c); // VABA.U16 d0,d0,d0 -uint32x2_t vaba_u32(uint32x2_t a, uint32x2_t b, uint32x2_t c); // VABA.U32 d0,d0,d0 -int8x16_t vabaq_s8(int8x16_t a, int8x16_t b, int8x16_t c); // VABA.S8 q0,q0,q0 -int16x8_t vabaq_s16(int16x8_t a, int16x8_t b, int16x8_t c); // VABA.S16 q0,q0,q0 -int32x4_t vabaq_s32(int32x4_t a, int32x4_t b, int32x4_t c); // VABA.S32 q0,q0,q0 -uint8x16_t vabaq_u8(uint8x16_t a, uint8x16_t b, uint8x16_t c); // VABA.U8 q0,q0,q0 -uint16x8_t vabaq_u16(uint16x8_t a, uint16x8_t b, uint16x8_t c); // VABA.U16 q0,q0,q0 -uint32x4_t vabaq_u32(uint32x4_t a, uint32x4_t b, uint32x4_t c); // VABA.U32 q0,q0,q0 -//Absolute difference and accumulate - long -int16x8_t vabal_s8(int16x8_t a, int8x8_t b, int8x8_t c); // VABAL.S8 q0,d0,d0 -int32x4_t vabal_s16(int32x4_t a, int16x4_t b, int16x4_t c); // VABAL.S16 q0,d0,d0 -int64x2_t vabal_s32(int64x2_t a, int32x2_t b, int32x2_t c); // VABAL.S32 q0,d0,d0 -uint16x8_t vabal_u8(uint16x8_t a, uint8x8_t b, uint8x8_t c); // VABAL.U8 q0,d0,d0 -uint32x4_t vabal_u16(uint32x4_t a, uint16x4_t b, uint16x4_t c); // VABAL.U16 q0,d0,d0 -uint64x2_t vabal_u32(uint64x2_t a, uint32x2_t b, uint32x2_t c); // VABAL.U32 q0,d0,d0 -//Max/Min -//vmax -> Vr[i] := (Va[i] >= Vb[i]) ? 
Va[i] : Vb[i] -int8x8_t vmax_s8(int8x8_t a, int8x8_t b); // VMAX.S8 d0,d0,d0 -int16x4_t vmax_s16(int16x4_t a, int16x4_t b); // VMAX.S16 d0,d0,d0 -int32x2_t vmax_s32(int32x2_t a, int32x2_t b); // VMAX.S32 d0,d0,d0 -uint8x8_t vmax_u8(uint8x8_t a, uint8x8_t b); // VMAX.U8 d0,d0,d0 -uint16x4_t vmax_u16(uint16x4_t a, uint16x4_t b); // VMAX.U16 d0,d0,d0 -uint32x2_t vmax_u32(uint32x2_t a, uint32x2_t b); // VMAX.U32 d0,d0,d0 -float32x2_t vmax_f32(float32x2_t a, float32x2_t b); // VMAX.F32 d0,d0,d0 -int8x16_t vmaxq_s8(int8x16_t a, int8x16_t b); // VMAX.S8 q0,q0,q0 -int16x8_t vmaxq_s16(int16x8_t a, int16x8_t b); // VMAX.S16 q0,q0,q0 -int32x4_t vmaxq_s32(int32x4_t a, int32x4_t b); // VMAX.S32 q0,q0,q0 -uint8x16_t vmaxq_u8(uint8x16_t a, uint8x16_t b); // VMAX.U8 q0,q0,q0 -uint16x8_t vmaxq_u16(uint16x8_t a, uint16x8_t b); // VMAX.U16 q0,q0,q0 -uint32x4_t vmaxq_u32(uint32x4_t a, uint32x4_t b); // VMAX.U32 q0,q0,q0 -float32x4_t vmaxq_f32(float32x4_t a, float32x4_t b); // VMAX.F32 q0,q0,q0 -//vmin -> Vr[i] := (Va[i] >= Vb[i]) ? 
Vb[i] : Va[i] -int8x8_t vmin_s8(int8x8_t a, int8x8_t b); // VMIN.S8 d0,d0,d0 -int16x4_t vmin_s16(int16x4_t a, int16x4_t b); // VMIN.S16 d0,d0,d0 -int32x2_t vmin_s32(int32x2_t a, int32x2_t b); // VMIN.S32 d0,d0,d0 -uint8x8_t vmin_u8(uint8x8_t a, uint8x8_t b); // VMIN.U8 d0,d0,d0 -uint16x4_t vmin_u16(uint16x4_t a, uint16x4_t b); // VMIN.U16 d0,d0,d0 -uint32x2_t vmin_u32(uint32x2_t a, uint32x2_t b); // VMIN.U32 d0,d0,d0 -float32x2_t vmin_f32(float32x2_t a, float32x2_t b); // VMIN.F32 d0,d0,d0 -int8x16_t vminq_s8(int8x16_t a, int8x16_t b); // VMIN.S8 q0,q0,q0 -int16x8_t vminq_s16(int16x8_t a, int16x8_t b); // VMIN.S16 q0,q0,q0 -int32x4_t vminq_s32(int32x4_t a, int32x4_t b); // VMIN.S32 q0,q0,q0 -uint8x16_t vminq_u8(uint8x16_t a, uint8x16_t b); // VMIN.U8 q0,q0,q0 -uint16x8_t vminq_u16(uint16x8_t a, uint16x8_t b); // VMIN.U16 q0,q0,q0 -uint32x4_t vminq_u32(uint32x4_t a, uint32x4_t b); // VMIN.U32 q0,q0,q0 -float32x4_t vminq_f32(float32x4_t a, float32x4_t b); // VMIN.F32 q0,q0,q0 -//Pairwise addition -//Pairwise add -int8x8_t vpadd_s8(int8x8_t a, int8x8_t b); // VPADD.I8 d0,d0,d0 -int16x4_t vpadd_s16(int16x4_t a, int16x4_t b); // VPADD.I16 d0,d0,d0 -int32x2_t vpadd_s32(int32x2_t a, int32x2_t b); // VPADD.I32 d0,d0,d0 -uint8x8_t vpadd_u8(uint8x8_t a, uint8x8_t b); // VPADD.I8 d0,d0,d0 -uint16x4_t vpadd_u16(uint16x4_t a, uint16x4_t b); // VPADD.I16 d0,d0,d0 -uint32x2_t vpadd_u32(uint32x2_t a, uint32x2_t b); // VPADD.I32 d0,d0,d0 -float32x2_t vpadd_f32(float32x2_t a, float32x2_t b); // VPADD.F32 d0,d0,d0 -//Long pairwise add -int16x4_t vpaddl_s8(int8x8_t a); // VPADDL.S8 d0,d0 -int32x2_t vpaddl_s16(int16x4_t a); // VPADDL.S16 d0,d0 -int64x1_t vpaddl_s32(int32x2_t a); // VPADDL.S32 d0,d0 -uint16x4_t vpaddl_u8(uint8x8_t a); // VPADDL.U8 d0,d0 -uint32x2_t vpaddl_u16(uint16x4_t a); // VPADDL.U16 d0,d0 -uint64x1_t vpaddl_u32(uint32x2_t a); // VPADDL.U32 d0,d0 -int16x8_t vpaddlq_s8(int8x16_t a); // VPADDL.S8 q0,q0 -int32x4_t vpaddlq_s16(int16x8_t a); // VPADDL.S16 q0,q0 
-int64x2_t vpaddlq_s32(int32x4_t a); // VPADDL.S32 q0,q0 -uint16x8_t vpaddlq_u8(uint8x16_t a); // VPADDL.U8 q0,q0 -uint32x4_t vpaddlq_u16(uint16x8_t a); // VPADDL.U16 q0,q0 -uint64x2_t vpaddlq_u32(uint32x4_t a); // VPADDL.U32 q0,q0 -//Long pairwise add and accumulate -int16x4_t vpadal_s8(int16x4_t a, int8x8_t b); // VPADAL.S8 d0,d0 -int32x2_t vpadal_s16(int32x2_t a, int16x4_t b); // VPADAL.S16 d0,d0 -int64x1_t vpadal_s32(int64x1_t a, int32x2_t b); // VPADAL.S32 d0,d0 -uint16x4_t vpadal_u8(uint16x4_t a, uint8x8_t b); // VPADAL.U8 d0,d0 -uint32x2_t vpadal_u16(uint32x2_t a, uint16x4_t b); // VPADAL.U16 d0,d0 -uint64x1_t vpadal_u32(uint64x1_t a, uint32x2_t b); // VPADAL.U32 d0,d0 -int16x8_t vpadalq_s8(int16x8_t a, int8x16_t b); // VPADAL.S8 q0,q0 -int32x4_t vpadalq_s16(int32x4_t a, int16x8_t b); // VPADAL.S16 q0,q0 -int64x2_t vpadalq_s32(int64x2_t a, int32x4_t b); // VPADAL.S32 q0,q0 -uint16x8_t vpadalq_u8(uint16x8_t a, uint8x16_t b); // VPADAL.U8 q0,q0 -uint32x4_t vpadalq_u16(uint32x4_t a, uint16x8_t b); // VPADAL.U16 q0,q0 -uint64x2_t vpadalq_u32(uint64x2_t a, uint32x4_t b); // VPADAL.U32 q0,q0 -//Folding maximum vpmax -> takes maximum of adjacent pairs -int8x8_t vpmax_s8(int8x8_t a, int8x8_t b); // VPMAX.S8 d0,d0,d0 -int16x4_t vpmax_s16(int16x4_t a, int16x4_t b); // VPMAX.S16 d0,d0,d0 -int32x2_t vpmax_s32(int32x2_t a, int32x2_t b); // VPMAX.S32 d0,d0,d0 -uint8x8_t vpmax_u8(uint8x8_t a, uint8x8_t b); // VPMAX.U8 d0,d0,d0 -uint16x4_t vpmax_u16(uint16x4_t a, uint16x4_t b); // VPMAX.U16 d0,d0,d0 -uint32x2_t vpmax_u32(uint32x2_t a, uint32x2_t b); // VPMAX.U32 d0,d0,d0 -float32x2_t vpmax_f32(float32x2_t a, float32x2_t b); // VPMAX.F32 d0,d0,d0 -//Folding minimum vpmin -> takes minimum of adjacent pairs -int8x8_t vpmin_s8(int8x8_t a, int8x8_t b); // VPMIN.S8 d0,d0,d0 -int16x4_t vpmin_s16(int16x4_t a, int16x4_t b); // VPMIN.S16 d0,d0,d0 -int32x2_t vpmin_s32(int32x2_t a, int32x2_t b); // VPMIN.S32 d0,d0,d0 -uint8x8_t vpmin_u8(uint8x8_t a, uint8x8_t b); // VPMIN.U8 d0,d0,d0 
-uint16x4_t vpmin_u16(uint16x4_t a, uint16x4_t b); // VPMIN.U16 d0,d0,d0 -uint32x2_t vpmin_u32(uint32x2_t a, uint32x2_t b); // VPMIN.U32 d0,d0,d0 -float32x2_t vpmin_f32(float32x2_t a, float32x2_t b); // VPMIN.F32 d0,d0,d0 -//Reciprocal/Sqrt -float32x2_t vrecps_f32(float32x2_t a, float32x2_t b); // VRECPS.F32 d0, d0, d0 -float32x4_t vrecpsq_f32(float32x4_t a, float32x4_t b); // VRECPS.F32 q0, q0, q0 -float32x2_t vrsqrts_f32(float32x2_t a, float32x2_t b); // VRSQRTS.F32 d0, d0, d0 -float32x4_t vrsqrtsq_f32(float32x4_t a, float32x4_t b); // VRSQRTS.F32 q0, q0, q0 -//Shifts by signed variable -//Vector shift left: Vr[i] := Va[i] << Vb[i] (negative values shift right) -int8x8_t vshl_s8(int8x8_t a, int8x8_t b); // VSHL.S8 d0,d0,d0 -int16x4_t vshl_s16(int16x4_t a, int16x4_t b); // VSHL.S16 d0,d0,d0 -int32x2_t vshl_s32(int32x2_t a, int32x2_t b); // VSHL.S32 d0,d0,d0 -int64x1_t vshl_s64(int64x1_t a, int64x1_t b); // VSHL.S64 d0,d0,d0 -uint8x8_t vshl_u8(uint8x8_t a, int8x8_t b); // VSHL.U8 d0,d0,d0 -uint16x4_t vshl_u16(uint16x4_t a, int16x4_t b); // VSHL.U16 d0,d0,d0 -uint32x2_t vshl_u32(uint32x2_t a, int32x2_t b); // VSHL.U32 d0,d0,d0 -uint64x1_t vshl_u64(uint64x1_t a, int64x1_t b); // VSHL.U64 d0,d0,d0 -int8x16_t vshlq_s8(int8x16_t a, int8x16_t b); // VSHL.S8 q0,q0,q0 -int16x8_t vshlq_s16(int16x8_t a, int16x8_t b); // VSHL.S16 q0,q0,q0 -int32x4_t vshlq_s32(int32x4_t a, int32x4_t b); // VSHL.S32 q0,q0,q0 -int64x2_t vshlq_s64(int64x2_t a, int64x2_t b); // VSHL.S64 q0,q0,q0 -uint8x16_t vshlq_u8(uint8x16_t a, int8x16_t b); // VSHL.U8 q0,q0,q0 -uint16x8_t vshlq_u16(uint16x8_t a, int16x8_t b); // VSHL.U16 q0,q0,q0 -uint32x4_t vshlq_u32(uint32x4_t a, int32x4_t b); // VSHL.U32 q0,q0,q0 -uint64x2_t vshlq_u64(uint64x2_t a, int64x2_t b); // VSHL.U64 q0,q0,q0 -//Vector saturating shift left: (negative values shift right) -int8x8_t vqshl_s8(int8x8_t a, int8x8_t b); // VQSHL.S8 d0,d0,d0 -int16x4_t vqshl_s16(int16x4_t a, int16x4_t b); // VQSHL.S16 d0,d0,d0 -int32x2_t vqshl_s32(int32x2_t 
a, int32x2_t b); // VQSHL.S32 d0,d0,d0 -int64x1_t vqshl_s64(int64x1_t a, int64x1_t b); // VQSHL.S64 d0,d0,d0 -uint8x8_t vqshl_u8(uint8x8_t a, int8x8_t b); // VQSHL.U8 d0,d0,d0 -uint16x4_t vqshl_u16(uint16x4_t a, int16x4_t b); // VQSHL.U16 d0,d0,d0 -uint32x2_t vqshl_u32(uint32x2_t a, int32x2_t b); // VQSHL.U32 d0,d0,d0 -uint64x1_t vqshl_u64(uint64x1_t a, int64x1_t b); // VQSHL.U64 d0,d0,d0 -int8x16_t vqshlq_s8(int8x16_t a, int8x16_t b); // VQSHL.S8 q0,q0,q0 -int16x8_t vqshlq_s16(int16x8_t a, int16x8_t b); // VQSHL.S16 q0,q0,q0 -int32x4_t vqshlq_s32(int32x4_t a, int32x4_t b); // VQSHL.S32 q0,q0,q0 -int64x2_t vqshlq_s64(int64x2_t a, int64x2_t b); // VQSHL.S64 q0,q0,q0 -uint8x16_t vqshlq_u8(uint8x16_t a, int8x16_t b); // VQSHL.U8 q0,q0,q0 -uint16x8_t vqshlq_u16(uint16x8_t a, int16x8_t b); // VQSHL.U16 q0,q0,q0 -uint32x4_t vqshlq_u32(uint32x4_t a, int32x4_t b); // VQSHL.U32 q0,q0,q0 -uint64x2_t vqshlq_u64(uint64x2_t a, int64x2_t b); // VQSHL.U64 q0,q0,q0 -//Vector rounding shift left: (negative values shift right) -int8x8_t vrshl_s8(int8x8_t a, int8x8_t b); // VRSHL.S8 d0,d0,d0 -int16x4_t vrshl_s16(int16x4_t a, int16x4_t b); // VRSHL.S16 d0,d0,d0 -int32x2_t vrshl_s32(int32x2_t a, int32x2_t b); // VRSHL.S32 d0,d0,d0 -int64x1_t vrshl_s64(int64x1_t a, int64x1_t b); // VRSHL.S64 d0,d0,d0 -uint8x8_t vrshl_u8(uint8x8_t a, int8x8_t b); // VRSHL.U8 d0,d0,d0 -uint16x4_t vrshl_u16(uint16x4_t a, int16x4_t b); // VRSHL.U16 d0,d0,d0 -uint32x2_t vrshl_u32(uint32x2_t a, int32x2_t b); // VRSHL.U32 d0,d0,d0 -uint64x1_t vrshl_u64(uint64x1_t a, int64x1_t b); // VRSHL.U64 d0,d0,d0 -int8x16_t vrshlq_s8(int8x16_t a, int8x16_t b); // VRSHL.S8 q0,q0,q0 -int16x8_t vrshlq_s16(int16x8_t a, int16x8_t b); // VRSHL.S16 q0,q0,q0 -int32x4_t vrshlq_s32(int32x4_t a, int32x4_t b); // VRSHL.S32 q0,q0,q0 -int64x2_t vrshlq_s64(int64x2_t a, int64x2_t b); // VRSHL.S64 q0,q0,q0 -uint8x16_t vrshlq_u8(uint8x16_t a, int8x16_t b); // VRSHL.U8 q0,q0,q0 -uint16x8_t vrshlq_u16(uint16x8_t a, int16x8_t b); // VRSHL.U16 
q0,q0,q0 -uint32x4_t vrshlq_u32(uint32x4_t a, int32x4_t b); // VRSHL.U32 q0,q0,q0 -uint64x2_t vrshlq_u64(uint64x2_t a, int64x2_t b); // VRSHL.U64 q0,q0,q0 -//Vector saturating rounding shift left: (negative values shift right) -int8x8_t vqrshl_s8(int8x8_t a, int8x8_t b); // VQRSHL.S8 d0,d0,d0 -int16x4_t vqrshl_s16(int16x4_t a, int16x4_t b); // VQRSHL.S16 d0,d0,d0 -int32x2_t vqrshl_s32(int32x2_t a, int32x2_t b); // VQRSHL.S32 d0,d0,d0 -int64x1_t vqrshl_s64(int64x1_t a, int64x1_t b); // VQRSHL.S64 d0,d0,d0 -uint8x8_t vqrshl_u8(uint8x8_t a, int8x8_t b); // VQRSHL.U8 d0,d0,d0 -uint16x4_t vqrshl_u16(uint16x4_t a, int16x4_t b); // VQRSHL.U16 d0,d0,d0 -uint32x2_t vqrshl_u32(uint32x2_t a, int32x2_t b); // VQRSHL.U32 d0,d0,d0 -uint64x1_t vqrshl_u64(uint64x1_t a, int64x1_t b); // VQRSHL.U64 d0,d0,d0 -int8x16_t vqrshlq_s8(int8x16_t a, int8x16_t b); // VQRSHL.S8 q0,q0,q0 -int16x8_t vqrshlq_s16(int16x8_t a, int16x8_t b); // VQRSHL.S16 q0,q0,q0 -int32x4_t vqrshlq_s32(int32x4_t a, int32x4_t b); // VQRSHL.S32 q0,q0,q0 -int64x2_t vqrshlq_s64(int64x2_t a, int64x2_t b); // VQRSHL.S64 q0,q0,q0 -uint8x16_t vqrshlq_u8(uint8x16_t a, int8x16_t b); // VQRSHL.U8 q0,q0,q0 -uint16x8_t vqrshlq_u16(uint16x8_t a, int16x8_t b); // VQRSHL.U16 q0,q0,q0 -uint32x4_t vqrshlq_u32(uint32x4_t a, int32x4_t b); // VQRSHL.U32 q0,q0,q0 -uint64x2_t vqrshlq_u64(uint64x2_t a, int64x2_t b); // VQRSHL.U64 q0,q0,q0 -//Shifts by a constant -//Vector shift right by constant -int8x8_t vshr_n_s8(int8x8_t a, __constrange(1,8) int b); // VSHR.S8 d0,d0,#8 -int16x4_t vshr_n_s16(int16x4_t a, __constrange(1,16) int b); // VSHR.S16 d0,d0,#16 -int32x2_t vshr_n_s32(int32x2_t a, __constrange(1,32) int b); // VSHR.S32 d0,d0,#32 -int64x1_t vshr_n_s64(int64x1_t a, __constrange(1,64) int b); // VSHR.S64 d0,d0,#64 -uint8x8_t vshr_n_u8(uint8x8_t a, __constrange(1,8) int b); // VSHR.U8 d0,d0,#8 -uint16x4_t vshr_n_u16(uint16x4_t a, __constrange(1,16) int b); // VSHR.U16 d0,d0,#16 -uint32x2_t vshr_n_u32(uint32x2_t a, __constrange(1,32) 
int b); // VSHR.U32 d0,d0,#32 -uint64x1_t vshr_n_u64(uint64x1_t a, __constrange(1,64) int b); // VSHR.U64 d0,d0,#64 -int8x16_t vshrq_n_s8(int8x16_t a, __constrange(1,8) int b); // VSHR.S8 q0,q0,#8 -int16x8_t vshrq_n_s16(int16x8_t a, __constrange(1,16) int b); // VSHR.S16 q0,q0,#16 -int32x4_t vshrq_n_s32(int32x4_t a, __constrange(1,32) int b); // VSHR.S32 q0,q0,#32 -int64x2_t vshrq_n_s64(int64x2_t a, __constrange(1,64) int b); // VSHR.S64 q0,q0,#64 -uint8x16_t vshrq_n_u8(uint8x16_t a, __constrange(1,8) int b); // VSHR.U8 q0,q0,#8 -uint16x8_t vshrq_n_u16(uint16x8_t a, __constrange(1,16) int b); // VSHR.U16 q0,q0,#16 -uint32x4_t vshrq_n_u32(uint32x4_t a, __constrange(1,32) int b); // VSHR.U32 q0,q0,#32 -uint64x2_t vshrq_n_u64(uint64x2_t a, __constrange(1,64) int b); // VSHR.U64 q0,q0,#64 -//Vector shift left by constant -int8x8_t vshl_n_s8(int8x8_t a, __constrange(0,7) int b); // VSHL.I8 d0,d0,#0 -int16x4_t vshl_n_s16(int16x4_t a, __constrange(0,15) int b); // VSHL.I16 d0,d0,#0 -int32x2_t vshl_n_s32(int32x2_t a, __constrange(0,31) int b); // VSHL.I32 d0,d0,#0 -int64x1_t vshl_n_s64(int64x1_t a, __constrange(0,63) int b); // VSHL.I64 d0,d0,#0 -uint8x8_t vshl_n_u8(uint8x8_t a, __constrange(0,7) int b); // VSHL.I8 d0,d0,#0 -uint16x4_t vshl_n_u16(uint16x4_t a, __constrange(0,15) int b); // VSHL.I16 d0,d0,#0 -uint32x2_t vshl_n_u32(uint32x2_t a, __constrange(0,31) int b); // VSHL.I32 d0,d0,#0 -uint64x1_t vshl_n_u64(uint64x1_t a, __constrange(0,63) int b); // VSHL.I64 d0,d0,#0 -int8x16_t vshlq_n_s8(int8x16_t a, __constrange(0,7) int b); // VSHL.I8 q0,q0,#0 -int16x8_t vshlq_n_s16(int16x8_t a, __constrange(0,15) int b); // VSHL.I16 q0,q0,#0 -int32x4_t vshlq_n_s32(int32x4_t a, __constrange(0,31) int b); // VSHL.I32 q0,q0,#0 -int64x2_t vshlq_n_s64(int64x2_t a, __constrange(0,63) int b); // VSHL.I64 q0,q0,#0 -uint8x16_t vshlq_n_u8(uint8x16_t a, __constrange(0,7) int b); // VSHL.I8 q0,q0,#0 -uint16x8_t vshlq_n_u16(uint16x8_t a, __constrange(0,15) int b); // VSHL.I16 q0,q0,#0 
-uint32x4_t vshlq_n_u32(uint32x4_t a, __constrange(0,31) int b); // VSHL.I32 q0,q0,#0 -uint64x2_t vshlq_n_u64(uint64x2_t a, __constrange(0,63) int b); // VSHL.I64 q0,q0,#0 -//Vector rounding shift right by constant -int8x8_t vrshr_n_s8(int8x8_t a, __constrange(1,8) int b); // VRSHR.S8 d0,d0,#8 -int16x4_t vrshr_n_s16(int16x4_t a, __constrange(1,16) int b); // VRSHR.S16 d0,d0,#16 -int32x2_t vrshr_n_s32(int32x2_t a, __constrange(1,32) int b); // VRSHR.S32 d0,d0,#32 -int64x1_t vrshr_n_s64(int64x1_t a, __constrange(1,64) int b); // VRSHR.S64 d0,d0,#64 -uint8x8_t vrshr_n_u8(uint8x8_t a, __constrange(1,8) int b); // VRSHR.U8 d0,d0,#8 -uint16x4_t vrshr_n_u16(uint16x4_t a, __constrange(1,16) int b); // VRSHR.U16 d0,d0,#16 -uint32x2_t vrshr_n_u32(uint32x2_t a, __constrange(1,32) int b); // VRSHR.U32 d0,d0,#32 -uint64x1_t vrshr_n_u64(uint64x1_t a, __constrange(1,64) int b); // VRSHR.U64 d0,d0,#64 -int8x16_t vrshrq_n_s8(int8x16_t a, __constrange(1,8) int b); // VRSHR.S8 q0,q0,#8 -int16x8_t vrshrq_n_s16(int16x8_t a, __constrange(1,16) int b); // VRSHR.S16 q0,q0,#16 -int32x4_t vrshrq_n_s32(int32x4_t a, __constrange(1,32) int b); // VRSHR.S32 q0,q0,#32 -int64x2_t vrshrq_n_s64(int64x2_t a, __constrange(1,64) int b); // VRSHR.S64 q0,q0,#64 -uint8x16_t vrshrq_n_u8(uint8x16_t a, __constrange(1,8) int b); // VRSHR.U8 q0,q0,#8 -uint16x8_t vrshrq_n_u16(uint16x8_t a, __constrange(1,16) int b); // VRSHR.U16 q0,q0,#16 -uint32x4_t vrshrq_n_u32(uint32x4_t a, __constrange(1,32) int b); // VRSHR.U32 q0,q0,#32 -uint64x2_t vrshrq_n_u64(uint64x2_t a, __constrange(1,64) int b); // VRSHR.U64 q0,q0,#64 -//Vector shift right by constant and accumulate -int8x8_t vsra_n_s8(int8x8_t a, int8x8_t b, __constrange(1,8) int c); // VSRA.S8 d0,d0,#8 -int16x4_t vsra_n_s16(int16x4_t a, int16x4_t b, __constrange(1,16) int c); // VSRA.S16 d0,d0,#16 -int32x2_t vsra_n_s32(int32x2_t a, int32x2_t b, __constrange(1,32) int c); // VSRA.S32 d0,d0,#32 -int64x1_t vsra_n_s64(int64x1_t a, int64x1_t b, __constrange(1,64) int 
c); // VSRA.S64 d0,d0,#64 -uint8x8_t vsra_n_u8(uint8x8_t a, uint8x8_t b, __constrange(1,8) int c); // VSRA.U8 d0,d0,#8 -uint16x4_t vsra_n_u16(uint16x4_t a, uint16x4_t b, __constrange(1,16) int c); // VSRA.U16 d0,d0,#16 -uint32x2_t vsra_n_u32(uint32x2_t a, uint32x2_t b, __constrange(1,32) int c); // VSRA.U32 d0,d0,#32 -uint64x1_t vsra_n_u64(uint64x1_t a, uint64x1_t b, __constrange(1,64) int c); // VSRA.U64 d0,d0,#64 -int8x16_t vsraq_n_s8(int8x16_t a, int8x16_t b, __constrange(1,8) int c); // VSRA.S8 q0,q0,#8 -int16x8_t vsraq_n_s16(int16x8_t a, int16x8_t b, __constrange(1,16) int c); // VSRA.S16 q0,q0,#16 -int32x4_t vsraq_n_s32(int32x4_t a, int32x4_t b, __constrange(1,32) int c); // VSRA.S32 q0,q0,#32 -int64x2_t vsraq_n_s64(int64x2_t a, int64x2_t b, __constrange(1,64) int c); // VSRA.S64 q0,q0,#64 -uint8x16_t vsraq_n_u8(uint8x16_t a, uint8x16_t b, __constrange(1,8) int c); // VSRA.U8 q0,q0,#8 -uint16x8_t vsraq_n_u16(uint16x8_t a, uint16x8_t b, __constrange(1,16) int c); // VSRA.U16 q0,q0,#16 -uint32x4_t vsraq_n_u32(uint32x4_t a, uint32x4_t b, __constrange(1,32) int c); // VSRA.U32 q0,q0,#32 -uint64x2_t vsraq_n_u64(uint64x2_t a, uint64x2_t b, __constrange(1,64) int c); // VSRA.U64 q0,q0,#64 -//Vector rounding shift right by constant and accumulate -int8x8_t vrsra_n_s8(int8x8_t a, int8x8_t b, __constrange(1,8) int c); // VRSRA.S8 d0,d0,#8 -int16x4_t vrsra_n_s16(int16x4_t a, int16x4_t b, __constrange(1,16) int c); // VRSRA.S16 d0,d0,#16 -int32x2_t vrsra_n_s32(int32x2_t a, int32x2_t b, __constrange(1,32) int c); // VRSRA.S32 d0,d0,#32 -int64x1_t vrsra_n_s64(int64x1_t a, int64x1_t b, __constrange(1,64) int c); // VRSRA.S64 d0,d0,#64 -uint8x8_t vrsra_n_u8(uint8x8_t a, uint8x8_t b, __constrange(1,8) int c); // VRSRA.U8 d0,d0,#8 -uint16x4_t vrsra_n_u16(uint16x4_t a, uint16x4_t b, __constrange(1,16) int c); // VRSRA.U16 d0,d0,#16 -uint32x2_t vrsra_n_u32(uint32x2_t a, uint32x2_t b, __constrange(1,32) int c); // VRSRA.U32 d0,d0,#32 -uint64x1_t vrsra_n_u64(uint64x1_t a, 
uint64x1_t b, __constrange(1,64) int c); // VRSRA.U64 d0,d0,#64 -int8x16_t vrsraq_n_s8(int8x16_t a, int8x16_t b, __constrange(1,8) int c); // VRSRA.S8 q0,q0,#8 -int16x8_t vrsraq_n_s16(int16x8_t a, int16x8_t b, __constrange(1,16) int c); // VRSRA.S16 q0,q0,#16 -int32x4_t vrsraq_n_s32(int32x4_t a, int32x4_t b, __constrange(1,32) int c); // VRSRA.S32 q0,q0,#32 -int64x2_t vrsraq_n_s64(int64x2_t a, int64x2_t b, __constrange(1,64) int c); // VRSRA.S64 q0,q0,#64 -uint8x16_t vrsraq_n_u8(uint8x16_t a, uint8x16_t b, __constrange(1,8) int c); // VRSRA.U8 q0,q0,#8 -uint16x8_t vrsraq_n_u16(uint16x8_t a, uint16x8_t b, __constrange(1,16) int c); // VRSRA.U16 q0,q0,#16 -uint32x4_t vrsraq_n_u32(uint32x4_t a, uint32x4_t b, __constrange(1,32) int c); // VRSRA.U32 q0,q0,#32 -uint64x2_t vrsraq_n_u64(uint64x2_t a, uint64x2_t b, __constrange(1,64) int c); // VRSRA.U64 q0,q0,#64 -//Vector saturating shift left by constant -int8x8_t vqshl_n_s8(int8x8_t a, __constrange(0,7) int b); // VQSHL.S8 d0,d0,#0 -int16x4_t vqshl_n_s16(int16x4_t a, __constrange(0,15) int b); // VQSHL.S16 d0,d0,#0 -int32x2_t vqshl_n_s32(int32x2_t a, __constrange(0,31) int b); // VQSHL.S32 d0,d0,#0 -int64x1_t vqshl_n_s64(int64x1_t a, __constrange(0,63) int b); // VQSHL.S64 d0,d0,#0 -uint8x8_t vqshl_n_u8(uint8x8_t a, __constrange(0,7) int b); // VQSHL.U8 d0,d0,#0 -uint16x4_t vqshl_n_u16(uint16x4_t a, __constrange(0,15) int b); // VQSHL.U16 d0,d0,#0 -uint32x2_t vqshl_n_u32(uint32x2_t a, __constrange(0,31) int b); // VQSHL.U32 d0,d0,#0 -uint64x1_t vqshl_n_u64(uint64x1_t a, __constrange(0,63) int b); // VQSHL.U64 d0,d0,#0 -int8x16_t vqshlq_n_s8(int8x16_t a, __constrange(0,7) int b); // VQSHL.S8 q0,q0,#0 -int16x8_t vqshlq_n_s16(int16x8_t a, __constrange(0,15) int b); // VQSHL.S16 q0,q0,#0 -int32x4_t vqshlq_n_s32(int32x4_t a, __constrange(0,31) int b); // VQSHL.S32 q0,q0,#0 -int64x2_t vqshlq_n_s64(int64x2_t a, __constrange(0,63) int b); // VQSHL.S64 q0,q0,#0 -uint8x16_t vqshlq_n_u8(uint8x16_t a, __constrange(0,7) int b); // 
VQSHL.U8 q0,q0,#0 -uint16x8_t vqshlq_n_u16(uint16x8_t a, __constrange(0,15) int b); // VQSHL.U16 q0,q0,#0 -uint32x4_t vqshlq_n_u32(uint32x4_t a, __constrange(0,31) int b); // VQSHL.U32 q0,q0,#0 -uint64x2_t vqshlq_n_u64(uint64x2_t a, __constrange(0,63) int b); // VQSHL.U64 q0,q0,#0 -//Vector signed->unsigned saturating shift left by constant -uint8x8_t vqshlu_n_s8(int8x8_t a, __constrange(0,7) int b); // VQSHLU.S8 d0,d0,#0 -uint16x4_t vqshlu_n_s16(int16x4_t a, __constrange(0,15) int b); // VQSHLU.S16 d0,d0,#0 -uint32x2_t vqshlu_n_s32(int32x2_t a, __constrange(0,31) int b); // VQSHLU.S32 d0,d0,#0 -uint64x1_t vqshlu_n_s64(int64x1_t a, __constrange(0,63) int b); // VQSHLU.S64 d0,d0,#0 -uint8x16_t vqshluq_n_s8(int8x16_t a, __constrange(0,7) int b); // VQSHLU.S8 q0,q0,#0 -uint16x8_t vqshluq_n_s16(int16x8_t a, __constrange(0,15) int b); // VQSHLU.S16 q0,q0,#0 -uint32x4_t vqshluq_n_s32(int32x4_t a, __constrange(0,31) int b); // VQSHLU.S32 q0,q0,#0 -uint64x2_t vqshluq_n_s64(int64x2_t a, __constrange(0,63) int b); // VQSHLU.S64 q0,q0,#0 -//Vector narrowing shift right by constant -int8x8_t vshrn_n_s16(int16x8_t a, __constrange(1,8) int b); // VSHRN.I16 d0,q0,#8 -int16x4_t vshrn_n_s32(int32x4_t a, __constrange(1,16) int b); // VSHRN.I32 d0,q0,#16 -int32x2_t vshrn_n_s64(int64x2_t a, __constrange(1,32) int b); // VSHRN.I64 d0,q0,#32 -uint8x8_t vshrn_n_u16(uint16x8_t a, __constrange(1,8) int b); // VSHRN.I16 d0,q0,#8 -uint16x4_t vshrn_n_u32(uint32x4_t a, __constrange(1,16) int b); // VSHRN.I32 d0,q0,#16 -uint32x2_t vshrn_n_u64(uint64x2_t a, __constrange(1,32) int b); // VSHRN.I64 d0,q0,#32 -//Vector signed->unsigned narrowing saturating shift right by constant -uint8x8_t vqshrun_n_s16(int16x8_t a, __constrange(1,8) int b); // VQSHRUN.S16 d0,q0,#8 -uint16x4_t vqshrun_n_s32(int32x4_t a, __constrange(1,16) int b); // VQSHRUN.S32 d0,q0,#16 -uint32x2_t vqshrun_n_s64(int64x2_t a, __constrange(1,32) int b); // VQSHRUN.S64 d0,q0,#32 -//Vector signed->unsigned rounding narrowing 
saturating shift right by constant -uint8x8_t vqrshrun_n_s16(int16x8_t a, __constrange(1,8) int b); // VQRSHRUN.S16 d0,q0,#8 -uint16x4_t vqrshrun_n_s32(int32x4_t a, __constrange(1,16) int b); // VQRSHRUN.S32 d0,q0,#16 -uint32x2_t vqrshrun_n_s64(int64x2_t a, __constrange(1,32) int b); // VQRSHRUN.S64 d0,q0,#32 -//Vector narrowing saturating shift right by constant -int8x8_t vqshrn_n_s16(int16x8_t a, __constrange(1,8) int b); // VQSHRN.S16 d0,q0,#8 -int16x4_t vqshrn_n_s32(int32x4_t a, __constrange(1,16) int b); // VQSHRN.S32 d0,q0,#16 -int32x2_t vqshrn_n_s64(int64x2_t a, __constrange(1,32) int b); // VQSHRN.S64 d0,q0,#32 -uint8x8_t vqshrn_n_u16(uint16x8_t a, __constrange(1,8) int b); // VQSHRN.U16 d0,q0,#8 -uint16x4_t vqshrn_n_u32(uint32x4_t a, __constrange(1,16) int b); // VQSHRN.U32 d0,q0,#16 -uint32x2_t vqshrn_n_u64(uint64x2_t a, __constrange(1,32) int b); // VQSHRN.U64 d0,q0,#32 -//Vector rounding narrowing shift right by constant -int8x8_t vrshrn_n_s16(int16x8_t a, __constrange(1,8) int b); // VRSHRN.I16 d0,q0,#8 -int16x4_t vrshrn_n_s32(int32x4_t a, __constrange(1,16) int b); // VRSHRN.I32 d0,q0,#16 -int32x2_t vrshrn_n_s64(int64x2_t a, __constrange(1,32) int b); // VRSHRN.I64 d0,q0,#32 -uint8x8_t vrshrn_n_u16(uint16x8_t a, __constrange(1,8) int b); // VRSHRN.I16 d0,q0,#8 -uint16x4_t vrshrn_n_u32(uint32x4_t a, __constrange(1,16) int b); // VRSHRN.I32 d0,q0,#16 -uint32x2_t vrshrn_n_u64(uint64x2_t a, __constrange(1,32) int b); // VRSHRN.I64 d0,q0,#32 -//Vector rounding narrowing saturating shift right by constant -int8x8_t vqrshrn_n_s16(int16x8_t a, __constrange(1,8) int b); // VQRSHRN.S16 d0,q0,#8 -int16x4_t vqrshrn_n_s32(int32x4_t a, __constrange(1,16) int b); // VQRSHRN.S32 d0,q0,#16 -int32x2_t vqrshrn_n_s64(int64x2_t a, __constrange(1,32) int b); // VQRSHRN.S64 d0,q0,#32 -uint8x8_t vqrshrn_n_u16(uint16x8_t a, __constrange(1,8) int b); // VQRSHRN.U16 d0,q0,#8 -uint16x4_t vqrshrn_n_u32(uint32x4_t a, __constrange(1,16) int b); // VQRSHRN.U32 d0,q0,#16 -uint32x2_t 
vqrshrn_n_u64(uint64x2_t a, __constrange(1,32) int b); // VQRSHRN.U64 d0,q0,#32 -//Vector widening shift left by constant -int16x8_t vshll_n_s8(int8x8_t a, __constrange(0,8) int b); // VSHLL.S8 q0,d0,#0 -int32x4_t vshll_n_s16(int16x4_t a, __constrange(0,16) int b); // VSHLL.S16 q0,d0,#0 -int64x2_t vshll_n_s32(int32x2_t a, __constrange(0,32) int b); // VSHLL.S32 q0,d0,#0 -uint16x8_t vshll_n_u8(uint8x8_t a, __constrange(0,8) int b); // VSHLL.U8 q0,d0,#0 -uint32x4_t vshll_n_u16(uint16x4_t a, __constrange(0,16) int b); // VSHLL.U16 q0,d0,#0 -uint64x2_t vshll_n_u32(uint32x2_t a, __constrange(0,32) int b); // VSHLL.U32 q0,d0,#0 -//Shifts with insert -//Vector shift right and insert -int8x8_t vsri_n_s8(int8x8_t a, int8x8_t b, __constrange(1,8) int c); // VSRI.8 d0,d0,#8 -int16x4_t vsri_n_s16(int16x4_t a, int16x4_t b, __constrange(1,16) int c); // VSRI.16 d0,d0,#16 -int32x2_t vsri_n_s32(int32x2_t a, int32x2_t b, __constrange(1,32) int c); // VSRI.32 d0,d0,#32 -int64x1_t vsri_n_s64(int64x1_t a, int64x1_t b, __constrange(1,64) int c); // VSRI.64 d0,d0,#64 -uint8x8_t vsri_n_u8(uint8x8_t a, uint8x8_t b, __constrange(1,8) int c); // VSRI.8 d0,d0,#8 -uint16x4_t vsri_n_u16(uint16x4_t a, uint16x4_t b, __constrange(1,16) int c); // VSRI.16 d0,d0,#16 -uint32x2_t vsri_n_u32(uint32x2_t a, uint32x2_t b, __constrange(1,32) int c); // VSRI.32 d0,d0,#32 -uint64x1_t vsri_n_u64(uint64x1_t a, uint64x1_t b, __constrange(1,64) int c); // VSRI.64 d0,d0,#64 -poly8x8_t vsri_n_p8(poly8x8_t a, poly8x8_t b, __constrange(1,8) int c); // VSRI.8 d0,d0,#8 -poly16x4_t vsri_n_p16(poly16x4_t a, poly16x4_t b, __constrange(1,16) int c); // VSRI.16 d0,d0,#16 -int8x16_t vsriq_n_s8(int8x16_t a, int8x16_t b, __constrange(1,8) int c); // VSRI.8 q0,q0,#8 -int16x8_t vsriq_n_s16(int16x8_t a, int16x8_t b, __constrange(1,16) int c); // VSRI.16 q0,q0,#16 -int32x4_t vsriq_n_s32(int32x4_t a, int32x4_t b, __constrange(1,32) int c); // VSRI.32 q0,q0,#32 -int64x2_t vsriq_n_s64(int64x2_t a, int64x2_t b, __constrange(1,64) 
int c); // VSRI.64 q0,q0,#64 -uint8x16_t vsriq_n_u8(uint8x16_t a, uint8x16_t b, __constrange(1,8) int c); // VSRI.8 q0,q0,#8 -uint16x8_t vsriq_n_u16(uint16x8_t a, uint16x8_t b, __constrange(1,16) int c); // VSRI.16 q0,q0,#16 -uint32x4_t vsriq_n_u32(uint32x4_t a, uint32x4_t b, __constrange(1,32) int c); // VSRI.32 q0,q0,#32 -uint64x2_t vsriq_n_u64(uint64x2_t a, uint64x2_t b, __constrange(1,64) int c); // VSRI.64 q0,q0,#64 -poly8x16_t vsriq_n_p8(poly8x16_t a, poly8x16_t b, __constrange(1,8) int c); // VSRI.8 q0,q0,#8 -poly16x8_t vsriq_n_p16(poly16x8_t a, poly16x8_t b, __constrange(1,16) int c); // VSRI.16 q0,q0,#16 -//Vector shift left and insert -int8x8_t vsli_n_s8(int8x8_t a, int8x8_t b, __constrange(0,7) int c); // VSLI.8 d0,d0,#0 -int16x4_t vsli_n_s16(int16x4_t a, int16x4_t b, __constrange(0,15) int c); // VSLI.16 d0,d0,#0 -int32x2_t vsli_n_s32(int32x2_t a, int32x2_t b, __constrange(0,31) int c); // VSLI.32 d0,d0,#0 -int64x1_t vsli_n_s64(int64x1_t a, int64x1_t b, __constrange(0,63) int c); // VSLI.64 d0,d0,#0 -uint8x8_t vsli_n_u8(uint8x8_t a, uint8x8_t b, __constrange(0,7) int c); // VSLI.8 d0,d0,#0 -uint16x4_t vsli_n_u16(uint16x4_t a, uint16x4_t b, __constrange(0,15) int c); // VSLI.16 d0,d0,#0 -uint32x2_t vsli_n_u32(uint32x2_t a, uint32x2_t b, __constrange(0,31) int c); // VSLI.32 d0,d0,#0 -uint64x1_t vsli_n_u64(uint64x1_t a, uint64x1_t b, __constrange(0,63) int c); // VSLI.64 d0,d0,#0 -poly8x8_t vsli_n_p8(poly8x8_t a, poly8x8_t b, __constrange(0,7) int c); // VSLI.8 d0,d0,#0 -poly16x4_t vsli_n_p16(poly16x4_t a, poly16x4_t b, __constrange(0,15) int c); // VSLI.16 d0,d0,#0 -int8x16_t vsliq_n_s8(int8x16_t a, int8x16_t b, __constrange(0,7) int c); // VSLI.8 q0,q0,#0 -int16x8_t vsliq_n_s16(int16x8_t a, int16x8_t b, __constrange(0,15) int c); // VSLI.16 q0,q0,#0 -int32x4_t vsliq_n_s32(int32x4_t a, int32x4_t b, __constrange(0,31) int c); // VSLI.32 q0,q0,#0 -int64x2_t vsliq_n_s64(int64x2_t a, int64x2_t b, __constrange(0,63) int c); // VSLI.64 q0,q0,#0 -uint8x16_t 
vsliq_n_u8(uint8x16_t a, uint8x16_t b, __constrange(0,7) int c); // VSLI.8 q0,q0,#0 -uint16x8_t vsliq_n_u16(uint16x8_t a, uint16x8_t b, __constrange(0,15) int c); // VSLI.16 q0,q0,#0 -uint32x4_t vsliq_n_u32(uint32x4_t a, uint32x4_t b, __constrange(0,31) int c); // VSLI.32 q0,q0,#0 -uint64x2_t vsliq_n_u64(uint64x2_t a, uint64x2_t b, __constrange(0,63) int c); // VSLI.64 q0,q0,#0 -poly8x16_t vsliq_n_p8(poly8x16_t a, poly8x16_t b, __constrange(0,7) int c); // VSLI.8 q0,q0,#0 -poly16x8_t vsliq_n_p16(poly16x8_t a, poly16x8_t b, __constrange(0,15) int c); // VSLI.16 q0,q0,#0 -//Loads of a single vector or lane. Perform loads and stores of a single vector of some type. -//Load a single vector from memory -uint8x16_t vld1q_u8(__transfersize(16) uint8_t const * ptr); // VLD1.8 {d0, d1}, [r0] -uint16x8_t vld1q_u16(__transfersize(8) uint16_t const * ptr); // VLD1.16 {d0, d1}, [r0] -uint32x4_t vld1q_u32(__transfersize(4) uint32_t const * ptr); // VLD1.32 {d0, d1}, [r0] -uint64x2_t vld1q_u64(__transfersize(2) uint64_t const * ptr); // VLD1.64 {d0, d1}, [r0] -int8x16_t vld1q_s8(__transfersize(16) int8_t const * ptr); // VLD1.8 {d0, d1}, [r0] -int16x8_t vld1q_s16(__transfersize(8) int16_t const * ptr); // VLD1.16 {d0, d1}, [r0] -int32x4_t vld1q_s32(__transfersize(4) int32_t const * ptr); // VLD1.32 {d0, d1}, [r0] -int64x2_t vld1q_s64(__transfersize(2) int64_t const * ptr); // VLD1.64 {d0, d1}, [r0] -float16x8_t vld1q_f16(__transfersize(8) __fp16 const * ptr); // VLD1.16 {d0, d1}, [r0] -float32x4_t vld1q_f32(__transfersize(4) float32_t const * ptr); // VLD1.32 {d0, d1}, [r0] -poly8x16_t vld1q_p8(__transfersize(16) poly8_t const * ptr); // VLD1.8 {d0, d1}, [r0] -poly16x8_t vld1q_p16(__transfersize(8) poly16_t const * ptr); // VLD1.16 {d0, d1}, [r0] -uint8x8_t vld1_u8(__transfersize(8) uint8_t const * ptr); // VLD1.8 {d0}, [r0] -uint16x4_t vld1_u16(__transfersize(4) uint16_t const * ptr); // VLD1.16 {d0}, [r0] -uint32x2_t vld1_u32(__transfersize(2) uint32_t const * ptr); // VLD1.32 
{d0}, [r0] -uint64x1_t vld1_u64(__transfersize(1) uint64_t const * ptr); // VLD1.64 {d0}, [r0] -int8x8_t vld1_s8(__transfersize(8) int8_t const * ptr); // VLD1.8 {d0}, [r0] -int16x4_t vld1_s16(__transfersize(4) int16_t const * ptr); // VLD1.16 {d0}, [r0] -int32x2_t vld1_s32(__transfersize(2) int32_t const * ptr); // VLD1.32 {d0}, [r0] -int64x1_t vld1_s64(__transfersize(1) int64_t const * ptr); // VLD1.64 {d0}, [r0] -float16x4_t vld1_f16(__transfersize(4) __fp16 const * ptr); // VLD1.16 {d0}, [r0] -float32x2_t vld1_f32(__transfersize(2) float32_t const * ptr); // VLD1.32 {d0}, [r0] -poly8x8_t vld1_p8(__transfersize(8) poly8_t const * ptr); // VLD1.8 {d0}, [r0] -poly16x4_t vld1_p16(__transfersize(4) poly16_t const * ptr); // VLD1.16 {d0}, [r0] -//Load a single lane from memory -uint8x16_t vld1q_lane_u8(__transfersize(1) uint8_t const * ptr, uint8x16_t vec, __constrange(0,15) int lane); //VLD1.8 {d0[0]}, [r0] -uint16x8_t vld1q_lane_u16(__transfersize(1) uint16_t const * ptr, uint16x8_t vec, __constrange(0,7) int lane); // VLD1.16 {d0[0]}, [r0] -uint32x4_t vld1q_lane_u32(__transfersize(1) uint32_t const * ptr, uint32x4_t vec, __constrange(0,3) int lane); // VLD1.32 {d0[0]}, [r0] -uint64x2_t vld1q_lane_u64(__transfersize(1) uint64_t const * ptr, uint64x2_t vec, __constrange(0,1) int lane); // VLD1.64 {d0}, [r0] -int8x16_t vld1q_lane_s8(__transfersize(1) int8_t const * ptr, int8x16_t vec, __constrange(0,15) int lane); //VLD1.8 {d0[0]}, [r0] -int16x8_t vld1q_lane_s16(__transfersize(1) int16_t const * ptr, int16x8_t vec, __constrange(0,7) int lane); //VLD1.16 {d0[0]}, [r0] -int32x4_t vld1q_lane_s32(__transfersize(1) int32_t const * ptr, int32x4_t vec, __constrange(0,3) int lane); //VLD1.32 {d0[0]}, [r0] -float16x8_t vld1q_lane_f16(__transfersize(1) __fp16 const * ptr, float16x8_t vec, __constrange(0,7) int lane); //VLD1.16 {d0[0]}, [r0] -float32x4_t vld1q_lane_f32(__transfersize(1) float32_t const * ptr, float32x4_t vec, __constrange(0,3) int lane); // VLD1.32 {d0[0]}, 
[r0] -int64x2_t vld1q_lane_s64(__transfersize(1) int64_t const * ptr, int64x2_t vec, __constrange(0,1) int lane); //VLD1.64 {d0}, [r0] -poly8x16_t vld1q_lane_p8(__transfersize(1) poly8_t const * ptr, poly8x16_t vec, __constrange(0,15) int lane); //VLD1.8 {d0[0]}, [r0] -poly16x8_t vld1q_lane_p16(__transfersize(1) poly16_t const * ptr, poly16x8_t vec, __constrange(0,7) int lane); // VLD1.16 {d0[0]}, [r0] -uint8x8_t vld1_lane_u8(__transfersize(1) uint8_t const * ptr, uint8x8_t vec, __constrange(0,7) int lane); //VLD1.8 {d0[0]}, [r0] -uint16x4_t vld1_lane_u16(__transfersize(1) uint16_t const * ptr, uint16x4_t vec, __constrange(0,3) int lane); //VLD1.16 {d0[0]}, [r0] -uint32x2_t vld1_lane_u32(__transfersize(1) uint32_t const * ptr, uint32x2_t vec, __constrange(0,1) int lane); //VLD1.32 {d0[0]}, [r0] -uint64x1_t vld1_lane_u64(__transfersize(1) uint64_t const * ptr, uint64x1_t vec, __constrange(0,0) int lane); //VLD1.64 {d0}, [r0] -int8x8_t vld1_lane_s8(__transfersize(1) int8_t const * ptr, int8x8_t vec, __constrange(0,7) int lane); // VLD1.8 {d0[0]}, [r0] -int16x4_t vld1_lane_s16(__transfersize(1) int16_t const * ptr, int16x4_t vec, __constrange(0,3) int lane); //VLD1.16 {d0[0]}, [r0] -int32x2_t vld1_lane_s32(__transfersize(1) int32_t const * ptr, int32x2_t vec, __constrange(0,1) int lane); //VLD1.32 {d0[0]}, [r0] -float16x4_t vld1_lane_f16(__transfersize(1) __fp16 const * ptr, float16x4_t vec, __constrange(0,3) int lane); //VLD1.16 {d0[0]}, [r0] -float32x2_t vld1_lane_f32(__transfersize(1) float32_t const * ptr, float32x2_t vec, __constrange(0,1) int lane); // VLD1.32 {d0[0]}, [r0] -int64x1_t vld1_lane_s64(__transfersize(1) int64_t const * ptr, int64x1_t vec, __constrange(0,0) int lane); //VLD1.64 {d0}, [r0] -poly8x8_t vld1_lane_p8(__transfersize(1) poly8_t const * ptr, poly8x8_t vec, __constrange(0,7) int lane); //VLD1.8 {d0[0]}, [r0] -poly16x4_t vld1_lane_p16(__transfersize(1) poly16_t const * ptr, poly16x4_t vec, __constrange(0,3) int lane); //VLD1.16 {d0[0]}, [r0] 
-//Load all lanes of vector with same value from memory -uint8x16_t vld1q_dup_u8(__transfersize(1) uint8_t const * ptr); // VLD1.8 {d0[]}, [r0] -uint16x8_t vld1q_dup_u16(__transfersize(1) uint16_t const * ptr); // VLD1.16 {d0[]}, [r0] -uint32x4_t vld1q_dup_u32(__transfersize(1) uint32_t const * ptr); // VLD1.32 {d0[]}, [r0] -uint64x2_t vld1q_dup_u64(__transfersize(1) uint64_t const * ptr); // VLD1.64 {d0}, [r0] -int8x16_t vld1q_dup_s8(__transfersize(1) int8_t const * ptr); // VLD1.8 {d0[]}, [r0] -int16x8_t vld1q_dup_s16(__transfersize(1) int16_t const * ptr); // VLD1.16 {d0[]}, [r0] -int32x4_t vld1q_dup_s32(__transfersize(1) int32_t const * ptr); // VLD1.32 {d0[]}, [r0] -int64x2_t vld1q_dup_s64(__transfersize(1) int64_t const * ptr); // VLD1.64 {d0}, [r0] -float16x8_t vld1q_dup_f16(__transfersize(1) __fp16 const * ptr); // VLD1.16 {d0[]}, [r0] -float32x4_t vld1q_dup_f32(__transfersize(1) float32_t const * ptr); // VLD1.32 {d0[]}, [r0] -poly8x16_t vld1q_dup_p8(__transfersize(1) poly8_t const * ptr); // VLD1.8 {d0[]}, [r0] -poly16x8_t vld1q_dup_p16(__transfersize(1) poly16_t const * ptr); // VLD1.16 {d0[]}, [r0] -uint8x8_t vld1_dup_u8(__transfersize(1) uint8_t const * ptr); // VLD1.8 {d0[]}, [r0] -uint16x4_t vld1_dup_u16(__transfersize(1) uint16_t const * ptr); // VLD1.16 {d0[]}, [r0] -uint32x2_t vld1_dup_u32(__transfersize(1) uint32_t const * ptr); // VLD1.32 {d0[]}, [r0] -uint64x1_t vld1_dup_u64(__transfersize(1) uint64_t const * ptr); // VLD1.64 {d0}, [r0] -int8x8_t vld1_dup_s8(__transfersize(1) int8_t const * ptr); // VLD1.8 {d0[]}, [r0] -int16x4_t vld1_dup_s16(__transfersize(1) int16_t const * ptr); // VLD1.16 {d0[]}, [r0] -int32x2_t vld1_dup_s32(__transfersize(1) int32_t const * ptr); // VLD1.32 {d0[]}, [r0] -int64x1_t vld1_dup_s64(__transfersize(1) int64_t const * ptr); // VLD1.64 {d0}, [r0] -float16x4_t vld1_dup_f16(__transfersize(1) __fp16 const * ptr); // VLD1.16 {d0[]}, [r0] -float32x2_t vld1_dup_f32(__transfersize(1) float32_t const * ptr); // VLD1.32 
{d0[]}, [r0]
-poly8x8_t vld1_dup_p8(__transfersize(1) poly8_t const * ptr); // VLD1.8 {d0[]}, [r0]
-poly16x4_t vld1_dup_p16(__transfersize(1) poly16_t const * ptr); // VLD1.16 {d0[]}, [r0]
-//Store a single vector or lane. Stores all lanes or a single lane of a vector.
-//Store a single vector into memory
-void vst1q_u8(__transfersize(16) uint8_t * ptr, uint8x16_t val); // VST1.8 {d0, d1}, [r0]
-void vst1q_u16(__transfersize(8) uint16_t * ptr, uint16x8_t val); // VST1.16 {d0, d1}, [r0]
-void vst1q_u32(__transfersize(4) uint32_t * ptr, uint32x4_t val); // VST1.32 {d0, d1}, [r0]
-void vst1q_u64(__transfersize(2) uint64_t * ptr, uint64x2_t val); // VST1.64 {d0, d1}, [r0]
-void vst1q_s8(__transfersize(16) int8_t * ptr, int8x16_t val); // VST1.8 {d0, d1}, [r0]
-void vst1q_s16(__transfersize(8) int16_t * ptr, int16x8_t val); // VST1.16 {d0, d1}, [r0]
-void vst1q_s32(__transfersize(4) int32_t * ptr, int32x4_t val); // VST1.32 {d0, d1}, [r0]
-void vst1q_s64(__transfersize(2) int64_t * ptr, int64x2_t val); // VST1.64 {d0, d1}, [r0]
-void vst1q_f16(__transfersize(8) __fp16 * ptr, float16x8_t val); // VST1.16 {d0, d1}, [r0]
-void vst1q_f32(__transfersize(4) float32_t * ptr, float32x4_t val); // VST1.32 {d0, d1}, [r0]
-void vst1q_p8(__transfersize(16) poly8_t * ptr, poly8x16_t val); // VST1.8 {d0, d1}, [r0]
-void vst1q_p16(__transfersize(8) poly16_t * ptr, poly16x8_t val); // VST1.16 {d0, d1}, [r0]
-void vst1_u8(__transfersize(8) uint8_t * ptr, uint8x8_t val); // VST1.8 {d0}, [r0]
-void vst1_u16(__transfersize(4) uint16_t * ptr, uint16x4_t val); // VST1.16 {d0}, [r0]
-void vst1_u32(__transfersize(2) uint32_t * ptr, uint32x2_t val); // VST1.32 {d0}, [r0]
-void vst1_u64(__transfersize(1) uint64_t * ptr, uint64x1_t val); // VST1.64 {d0}, [r0]
-void vst1_s8(__transfersize(8) int8_t * ptr, int8x8_t val); // VST1.8 {d0}, [r0]
-void vst1_s16(__transfersize(4) int16_t * ptr, int16x4_t val); // VST1.16 {d0}, [r0]
-void vst1_s32(__transfersize(2) int32_t * ptr, int32x2_t val); // VST1.32 {d0}, [r0]
-void vst1_s64(__transfersize(1) int64_t * ptr, int64x1_t val); // VST1.64 {d0}, [r0]
-void vst1_f16(__transfersize(4) __fp16 * ptr, float16x4_t val); // VST1.16 {d0}, [r0]
-void vst1_f32(__transfersize(2) float32_t * ptr, float32x2_t val); // VST1.32 {d0}, [r0]
-void vst1_p8(__transfersize(8) poly8_t * ptr, poly8x8_t val); // VST1.8 {d0}, [r0]
-void vst1_p16(__transfersize(4) poly16_t * ptr, poly16x4_t val); // VST1.16 {d0}, [r0]
-//Store a lane of a vector into memory
-//Loads of an N-element structure
-//Load N-element structure from memory
-uint8x16x2_t vld2q_u8(__transfersize(32) uint8_t const * ptr); // VLD2.8 {d0, d2}, [r0]
-uint16x8x2_t vld2q_u16(__transfersize(16) uint16_t const * ptr); // VLD2.16 {d0, d2}, [r0]
-uint32x4x2_t vld2q_u32(__transfersize(8) uint32_t const * ptr); // VLD2.32 {d0, d2}, [r0]
-int8x16x2_t vld2q_s8(__transfersize(32) int8_t const * ptr); // VLD2.8 {d0, d2}, [r0]
-int16x8x2_t vld2q_s16(__transfersize(16) int16_t const * ptr); // VLD2.16 {d0, d2}, [r0]
-int32x4x2_t vld2q_s32(__transfersize(8) int32_t const * ptr); // VLD2.32 {d0, d2}, [r0]
-float16x8x2_t vld2q_f16(__transfersize(16) __fp16 const * ptr); // VLD2.16 {d0, d2}, [r0]
-float32x4x2_t vld2q_f32(__transfersize(8) float32_t const * ptr); // VLD2.32 {d0, d2}, [r0]
-poly8x16x2_t vld2q_p8(__transfersize(32) poly8_t const * ptr); // VLD2.8 {d0, d2}, [r0]
-poly16x8x2_t vld2q_p16(__transfersize(16) poly16_t const * ptr); // VLD2.16 {d0, d2}, [r0]
-uint8x8x2_t vld2_u8(__transfersize(16) uint8_t const * ptr); // VLD2.8 {d0, d1}, [r0]
-uint16x4x2_t vld2_u16(__transfersize(8) uint16_t const * ptr); // VLD2.16 {d0, d1}, [r0]
-uint32x2x2_t vld2_u32(__transfersize(4) uint32_t const * ptr); // VLD2.32 {d0, d1}, [r0]
-uint64x1x2_t vld2_u64(__transfersize(2) uint64_t const * ptr); // VLD1.64 {d0, d1}, [r0]
-int8x8x2_t vld2_s8(__transfersize(16) int8_t const * ptr); // VLD2.8 {d0, d1}, [r0]
-int16x4x2_t vld2_s16(__transfersize(8) int16_t const * ptr); // VLD2.16 {d0, d1}, [r0]
-int32x2x2_t vld2_s32(__transfersize(4) int32_t const * ptr); // VLD2.32 {d0, d1}, [r0]
-int64x1x2_t vld2_s64(__transfersize(2) int64_t const * ptr); // VLD1.64 {d0, d1}, [r0]
-//float16x4x2_t vld2_f16(__transfersize(8) __fp16 const * ptr); // VLD2.16 {d0, d1}, [r0]
-float32x2x2_t vld2_f32(__transfersize(4) float32_t const * ptr); // VLD2.32 {d0, d1}, [r0]
-poly8x8x2_t vld2_p8(__transfersize(16) poly8_t const * ptr); // VLD2.8 {d0, d1}, [r0]
-poly16x4x2_t vld2_p16(__transfersize(8) poly16_t const * ptr); // VLD2.16 {d0, d1}, [r0]
-uint8x16x3_t vld3q_u8(__transfersize(48) uint8_t const * ptr); // VLD3.8 {d0, d2, d4}, [r0]
-uint16x8x3_t vld3q_u16(__transfersize(24) uint16_t const * ptr); // VLD3.16 {d0, d2, d4}, [r0]
-uint32x4x3_t vld3q_u32(__transfersize(12) uint32_t const * ptr); // VLD3.32 {d0, d2, d4}, [r0]
-int8x16x3_t vld3q_s8(__transfersize(48) int8_t const * ptr); // VLD3.8 {d0, d2, d4}, [r0]
-int16x8x3_t vld3q_s16(__transfersize(24) int16_t const * ptr); // VLD3.16 {d0, d2, d4}, [r0]
-int32x4x3_t vld3q_s32(__transfersize(12) int32_t const * ptr); // VLD3.32 {d0, d2, d4}, [r0]
-float16x8x3_t vld3q_f16(__transfersize(24) __fp16 const * ptr); // VLD3.16 {d0, d2, d4}, [r0]
-float32x4x3_t vld3q_f32(__transfersize(12) float32_t const * ptr); // VLD3.32 {d0, d2, d4}, [r0]
-poly8x16x3_t vld3q_p8(__transfersize(48) poly8_t const * ptr); // VLD3.8 {d0, d2, d4}, [r0]
-poly16x8x3_t vld3q_p16(__transfersize(24) poly16_t const * ptr); // VLD3.16 {d0, d2, d4}, [r0]
-uint8x8x3_t vld3_u8(__transfersize(24) uint8_t const * ptr); // VLD3.8 {d0, d1, d2}, [r0]
-uint16x4x3_t vld3_u16(__transfersize(12) uint16_t const * ptr); // VLD3.16 {d0, d1, d2}, [r0]
-uint32x2x3_t vld3_u32(__transfersize(6) uint32_t const * ptr); // VLD3.32 {d0, d1, d2}, [r0]
-uint64x1x3_t vld3_u64(__transfersize(3) uint64_t const * ptr); // VLD1.64 {d0, d1, d2}, [r0]
-int8x8x3_t vld3_s8(__transfersize(24) int8_t const * ptr); // VLD3.8 {d0, d1, d2}, [r0]
-int16x4x3_t vld3_s16(__transfersize(12) int16_t const * ptr); // VLD3.16 {d0, d1, d2}, [r0]
-int32x2x3_t vld3_s32(__transfersize(6) int32_t const * ptr); // VLD3.32 {d0, d1, d2}, [r0]
-int64x1x3_t vld3_s64(__transfersize(3) int64_t const * ptr); // VLD1.64 {d0, d1, d2}, [r0]
-float16x4x3_t vld3_f16(__transfersize(12) __fp16 const * ptr); // VLD3.16 {d0, d1, d2}, [r0]
-float32x2x3_t vld3_f32(__transfersize(6) float32_t const * ptr); // VLD3.32 {d0, d1, d2}, [r0]
-poly8x8x3_t vld3_p8(__transfersize(24) poly8_t const * ptr); // VLD3.8 {d0, d1, d2}, [r0]
-poly16x4x3_t vld3_p16(__transfersize(12) poly16_t const * ptr); // VLD3.16 {d0, d1, d2}, [r0]
-uint8x16x4_t vld4q_u8(__transfersize(64) uint8_t const * ptr); // VLD4.8 {d0, d2, d4, d6}, [r0]
-uint16x8x4_t vld4q_u16(__transfersize(32) uint16_t const * ptr); // VLD4.16 {d0, d2, d4, d6}, [r0]
-uint32x4x4_t vld4q_u32(__transfersize(16) uint32_t const * ptr); // VLD4.32 {d0, d2, d4, d6}, [r0]
-int8x16x4_t vld4q_s8(__transfersize(64) int8_t const * ptr); // VLD4.8 {d0, d2, d4, d6}, [r0]
-int16x8x4_t vld4q_s16(__transfersize(32) int16_t const * ptr); // VLD4.16 {d0, d2, d4, d6}, [r0]
-int32x4x4_t vld4q_s32(__transfersize(16) int32_t const * ptr); // VLD4.32 {d0, d2, d4, d6}, [r0]
-float16x8x4_t vld4q_f16(__transfersize(32) __fp16 const * ptr); // VLD4.16 {d0, d2, d4, d6}, [r0]
-float32x4x4_t vld4q_f32(__transfersize(16) float32_t const * ptr); // VLD4.32 {d0, d2, d4, d6}, [r0]
-poly8x16x4_t vld4q_p8(__transfersize(64) poly8_t const * ptr); // VLD4.8 {d0, d2, d4, d6}, [r0]
-poly16x8x4_t vld4q_p16(__transfersize(32) poly16_t const * ptr); // VLD4.16 {d0, d2, d4, d6}, [r0]
-uint8x8x4_t vld4_u8(__transfersize(32) uint8_t const * ptr); // VLD4.8 {d0, d1, d2, d3}, [r0]
-uint16x4x4_t vld4_u16(__transfersize(16) uint16_t const * ptr); // VLD4.16 {d0, d1, d2, d3}, [r0]
-uint32x2x4_t vld4_u32(__transfersize(8) uint32_t const * ptr); // VLD4.32 {d0, d1, d2, d3}, [r0]
-uint64x1x4_t vld4_u64(__transfersize(4) uint64_t const * ptr); // VLD1.64 {d0, d1, d2, d3}, [r0]
-int8x8x4_t vld4_s8(__transfersize(32) int8_t const * ptr); // VLD4.8 {d0, d1, d2, d3}, [r0]
-int16x4x4_t vld4_s16(__transfersize(16) int16_t const * ptr); // VLD4.16 {d0, d1, d2, d3}, [r0]
-int32x2x4_t vld4_s32(__transfersize(8) int32_t const * ptr); // VLD4.32 {d0, d1, d2, d3}, [r0]
-int64x1x4_t vld4_s64(__transfersize(4) int64_t const * ptr); // VLD1.64 {d0, d1, d2, d3}, [r0]
-float16x4x4_t vld4_f16(__transfersize(16) __fp16 const * ptr); // VLD4.16 {d0, d1, d2, d3}, [r0]
-float32x2x4_t vld4_f32(__transfersize(8) float32_t const * ptr); // VLD4.32 {d0, d1, d2, d3}, [r0]
-poly8x8x4_t vld4_p8(__transfersize(32) poly8_t const * ptr); // VLD4.8 {d0, d1, d2, d3}, [r0]
-poly16x4x4_t vld4_p16(__transfersize(16) poly16_t const * ptr); // VLD4.16 {d0, d1, d2, d3}, [r0]
-//Load all lanes of N-element structure with same value from memory
-uint8x8x2_t vld2_dup_u8(__transfersize(2) uint8_t const * ptr); // VLD2.8 {d0[], d1[]}, [r0]
-uint16x4x2_t vld2_dup_u16(__transfersize(2) uint16_t const * ptr); // VLD2.16 {d0[], d1[]}, [r0]
-uint32x2x2_t vld2_dup_u32(__transfersize(2) uint32_t const * ptr); // VLD2.32 {d0[], d1[]}, [r0]
-uint64x1x2_t vld2_dup_u64(__transfersize(2) uint64_t const * ptr); // VLD1.64 {d0, d1}, [r0]
-int8x8x2_t vld2_dup_s8(__transfersize(2) int8_t const * ptr); // VLD2.8 {d0[], d1[]}, [r0]
-int16x4x2_t vld2_dup_s16(__transfersize(2) int16_t const * ptr); // VLD2.16 {d0[], d1[]}, [r0]
-int32x2x2_t vld2_dup_s32(__transfersize(2) int32_t const * ptr); // VLD2.32 {d0[], d1[]}, [r0]
-int64x1x2_t vld2_dup_s64(__transfersize(2) int64_t const * ptr); // VLD1.64 {d0, d1}, [r0]
-//float16x4x2_t vld2_dup_f16(__transfersize(2) __fp16 const * ptr); // VLD2.16 {d0[], d1[]}, [r0]
-float32x2x2_t vld2_dup_f32(__transfersize(2) float32_t const * ptr); // VLD2.32 {d0[], d1[]}, [r0]
-poly8x8x2_t vld2_dup_p8(__transfersize(2) poly8_t const * ptr); // VLD2.8 {d0[], d1[]}, [r0]
-poly16x4x2_t vld2_dup_p16(__transfersize(2) poly16_t const * ptr); // VLD2.16 {d0[], d1[]}, [r0]
-uint8x8x3_t vld3_dup_u8(__transfersize(3) uint8_t const * ptr); // VLD3.8 {d0[], d1[], d2[]}, [r0]
-uint16x4x3_t vld3_dup_u16(__transfersize(3) uint16_t const * ptr); // VLD3.16 {d0[], d1[], d2[]}, [r0]
-uint32x2x3_t vld3_dup_u32(__transfersize(3) uint32_t const * ptr); // VLD3.32 {d0[], d1[], d2[]}, [r0]
-uint64x1x3_t vld3_dup_u64(__transfersize(3) uint64_t const * ptr); // VLD1.64 {d0, d1, d2}, [r0]
-int8x8x3_t vld3_dup_s8(__transfersize(3) int8_t const * ptr); // VLD3.8 {d0[], d1[], d2[]}, [r0]
-int16x4x3_t vld3_dup_s16(__transfersize(3) int16_t const * ptr); // VLD3.16 {d0[], d1[], d2[]}, [r0]
-int32x2x3_t vld3_dup_s32(__transfersize(3) int32_t const * ptr); // VLD3.32 {d0[], d1[], d2[]}, [r0]
-int64x1x3_t vld3_dup_s64(__transfersize(3) int64_t const * ptr); // VLD1.64 {d0, d1, d2}, [r0]
-float16x4x3_t vld3_dup_f16(__transfersize(3) __fp16 const * ptr); // VLD3.16 {d0[], d1[], d2[]}, [r0]
-float32x2x3_t vld3_dup_f32(__transfersize(3) float32_t const * ptr); // VLD3.32 {d0[], d1[], d2[]}, [r0]
-poly8x8x3_t vld3_dup_p8(__transfersize(3) poly8_t const * ptr); // VLD3.8 {d0[], d1[], d2[]}, [r0]
-poly16x4x3_t vld3_dup_p16(__transfersize(3) poly16_t const * ptr); // VLD3.16 {d0[], d1[], d2[]}, [r0]
-uint8x8x4_t vld4_dup_u8(__transfersize(4) uint8_t const * ptr); // VLD4.8 {d0[], d1[], d2[], d3[]}, [r0]
-uint16x4x4_t vld4_dup_u16(__transfersize(4) uint16_t const * ptr); // VLD4.16 {d0[], d1[], d2[], d3[]}, [r0]
-uint32x2x4_t vld4_dup_u32(__transfersize(4) uint32_t const * ptr); // VLD4.32 {d0[], d1[], d2[], d3[]}, [r0]
-uint64x1x4_t vld4_dup_u64(__transfersize(4) uint64_t const * ptr); // VLD1.64 {d0, d1, d2, d3}, [r0]
-int8x8x4_t vld4_dup_s8(__transfersize(4) int8_t const * ptr); // VLD4.8 {d0[], d1[], d2[], d3[]}, [r0]
-int16x4x4_t vld4_dup_s16(__transfersize(4) int16_t const * ptr); // VLD4.16 {d0[], d1[], d2[], d3[]}, [r0]
-int32x2x4_t vld4_dup_s32(__transfersize(4) int32_t const * ptr); // VLD4.32 {d0[], d1[], d2[], d3[]}, [r0]
-int64x1x4_t vld4_dup_s64(__transfersize(4) int64_t const * ptr); // VLD1.64 {d0, d1, d2, d3}, [r0]
-float16x4x4_t vld4_dup_f16(__transfersize(4) __fp16 const * ptr); // VLD4.16 {d0[], d1[], d2[], d3[]}, [r0]
-float32x2x4_t vld4_dup_f32(__transfersize(4) float32_t const * ptr); // VLD4.32 {d0[], d1[], d2[], d3[]}, [r0]
-poly8x8x4_t vld4_dup_p8(__transfersize(4) poly8_t const * ptr); // VLD4.8 {d0[], d1[], d2[], d3[]}, [r0]
-poly16x4x4_t vld4_dup_p16(__transfersize(4) poly16_t const * ptr); // VLD4.16 {d0[], d1[], d2[], d3[]}, [r0]
-//Load a single lane of N-element structure from memory
-//the functions below are modified to deal with the error C2719: 'src': formal parameter with __declspec(align('16')) won't be aligned
-uint16x8x2_t vld2q_lane_u16_ptr(__transfersize(2) uint16_t const * ptr, uint16x8x2_t * src, __constrange(0,7) int lane); // VLD2.16 {d0[0], d2[0]}, [r0]
-uint32x4x2_t vld2q_lane_u32_ptr(__transfersize(2) uint32_t const * ptr, uint32x4x2_t * src, __constrange(0,3) int lane); // VLD2.32 {d0[0], d2[0]}, [r0]
-int16x8x2_t vld2q_lane_s16_ptr(__transfersize(2) int16_t const * ptr, int16x8x2_t * src, __constrange(0,7) int lane); // VLD2.16 {d0[0], d2[0]}, [r0]
-int32x4x2_t vld2q_lane_s32_ptr(__transfersize(2) int32_t const * ptr, int32x4x2_t * src, __constrange(0,3) int lane); // VLD2.32 {d0[0], d2[0]}, [r0]
-float16x8x2_t vld2q_lane_f16_ptr(__transfersize(2) __fp16 const * ptr, float16x8x2_t * src, __constrange(0,7) int lane); // VLD2.16 {d0[0], d2[0]}, [r0]
-float32x4x2_t vld2q_lane_f32_ptr(__transfersize(2) float32_t const * ptr, float32x4x2_t * src, __constrange(0,3) int lane); // VLD2.32 {d0[0], d2[0]}, [r0]
-poly16x8x2_t vld2q_lane_p16_ptr(__transfersize(2) poly16_t const * ptr, poly16x8x2_t * src, __constrange(0,7) int lane); // VLD2.16 {d0[0], d2[0]}, [r0]
-uint8x8x2_t vld2_lane_u8_ptr(__transfersize(2) uint8_t const * ptr, uint8x8x2_t * src, __constrange(0,7) int lane); //VLD2.8 {d0[0], d1[0]}, [r0]
-uint16x4x2_t vld2_lane_u16_ptr(__transfersize(2) uint16_t const * ptr, uint16x4x2_t * src, __constrange(0,3) int lane); // VLD2.16 {d0[0], d1[0]}, [r0]
-uint32x2x2_t vld2_lane_u32_ptr(__transfersize(2) uint32_t const * ptr, uint32x2x2_t * src, __constrange(0,1) int lane); // VLD2.32 {d0[0], d1[0]}, [r0]
-int8x8x2_t vld2_lane_s8_ptr(__transfersize(2) int8_t const * ptr, int8x8x2_t * src, __constrange(0,7) int lane); //VLD2.8 {d0[0], d1[0]}, [r0]
-int16x4x2_t vld2_lane_s16_ptr(__transfersize(2) int16_t const * ptr, int16x4x2_t * src, __constrange(0,3) int lane); //VLD2.16 {d0[0], d1[0]}, [r0]
-int32x2x2_t vld2_lane_s32_ptr(__transfersize(2) int32_t const * ptr, int32x2x2_t * src, __constrange(0,1) int lane); //VLD2.32 {d0[0], d1[0]}, [r0]
-//float16x4x2_t vld2_lane_f16_ptr(__transfersize(2) __fp16 const * ptr, float16x4x2_t * src, __constrange(0,3) int lane); // VLD2.16 {d0[0], d1[0]}, [r0]
-float32x2x2_t vld2_lane_f32_ptr(__transfersize(2) float32_t const * ptr, float32x2x2_t * src, __constrange(0,1) int lane); // VLD2.32 {d0[0], d1[0]}, [r0]
-poly8x8x2_t vld2_lane_p8_ptr(__transfersize(2) poly8_t const * ptr, poly8x8x2_t * src, __constrange(0,7) int lane); //VLD2.8 {d0[0], d1[0]}, [r0]
-poly16x4x2_t vld2_lane_p16_ptr(__transfersize(2) poly16_t const * ptr, poly16x4x2_t * src, __constrange(0,3) int lane); // VLD2.16 {d0[0], d1[0]}, [r0]
-uint16x8x3_t vld3q_lane_u16_ptr(__transfersize(3) uint16_t const * ptr, uint16x8x3_t * src, __constrange(0,7) int lane); // VLD3.16 {d0[0], d2[0], d4[0]}, [r0]
-uint32x4x3_t vld3q_lane_u32_ptr(__transfersize(3) uint32_t const * ptr, uint32x4x3_t * src, __constrange(0,3) int lane); // VLD3.32 {d0[0], d2[0], d4[0]}, [r0]
-int16x8x3_t vld3q_lane_s16_ptr(__transfersize(3) int16_t const * ptr, int16x8x3_t * src, __constrange(0,7) int lane); // VLD3.16 {d0[0], d2[0], d4[0]}, [r0]
-int32x4x3_t vld3q_lane_s32_ptr(__transfersize(3) int32_t const * ptr, int32x4x3_t * src, __constrange(0,3) int lane); // VLD3.32 {d0[0], d2[0], d4[0]}, [r0]
-float16x8x3_t vld3q_lane_f16_ptr(__transfersize(3) __fp16 const * ptr, float16x8x3_t * src, __constrange(0,7) int lane); // VLD3.16 {d0[0], d2[0], d4[0]}, [r0]
-float32x4x3_t vld3q_lane_f32_ptr(__transfersize(3) float32_t const * ptr, float32x4x3_t * src, __constrange(0,3) int lane); // VLD3.32 {d0[0], d2[0], d4[0]}, [r0]
-poly16x8x3_t vld3q_lane_p16_ptr(__transfersize(3) poly16_t const * ptr, poly16x8x3_t * src, __constrange(0,7) int lane); // VLD3.16 {d0[0], d2[0], d4[0]}, [r0]
-uint8x8x3_t vld3_lane_u8_ptr(__transfersize(3) uint8_t const * ptr, uint8x8x3_t * src, __constrange(0,7) int lane); //VLD3.8 {d0[0], d1[0], d2[0]}, [r0]
-uint16x4x3_t vld3_lane_u16_ptr(__transfersize(3) uint16_t const * ptr, uint16x4x3_t * src, __constrange(0,3) int lane); // VLD3.16 {d0[0], d1[0], d2[0]}, [r0]
-uint32x2x3_t vld3_lane_u32_ptr(__transfersize(3) uint32_t const * ptr, uint32x2x3_t * src, __constrange(0,1) int lane); // VLD3.32 {d0[0], d1[0], d2[0]}, [r0]
-int8x8x3_t vld3_lane_s8_ptr(__transfersize(3) int8_t const * ptr, int8x8x3_t * src, __constrange(0,7) int lane); //VLD3.8 {d0[0], d1[0], d2[0]}, [r0]
-int16x4x3_t vld3_lane_s16_ptr(__transfersize(3) int16_t const * ptr, int16x4x3_t * src, __constrange(0,3) int lane); //VLD3.16 {d0[0], d1[0], d2[0]}, [r0]
-int32x2x3_t vld3_lane_s32_ptr(__transfersize(3) int32_t const * ptr, int32x2x3_t * src, __constrange(0,1) int lane); //VLD3.32 {d0[0], d1[0], d2[0]}, [r0]
-float16x4x3_t vld3_lane_f16_ptr(__transfersize(3) __fp16 const * ptr, float16x4x3_t * src, __constrange(0,3) int lane); // VLD3.16 {d0[0], d1[0], d2[0]}, [r0]
-float32x2x3_t vld3_lane_f32_ptr(__transfersize(3) float32_t const * ptr, float32x2x3_t * src, __constrange(0,1) int lane); // VLD3.32 {d0[0], d1[0], d2[0]}, [r0]
-poly8x8x3_t vld3_lane_p8_ptr(__transfersize(3) poly8_t const * ptr, poly8x8x3_t * src, __constrange(0,7) int lane); //VLD3.8 {d0[0], d1[0], d2[0]}, [r0]
-poly16x4x3_t vld3_lane_p16_ptr(__transfersize(3) poly16_t const * ptr, poly16x4x3_t * src, __constrange(0,3) int lane); // VLD3.16 {d0[0], d1[0], d2[0]}, [r0]
-uint16x8x4_t vld4q_lane_u16_ptr(__transfersize(4) uint16_t const * ptr, uint16x8x4_t * src, __constrange(0,7) int lane); // VLD4.16 {d0[0], d2[0], d4[0], d6[0]}, [r0]
-uint32x4x4_t vld4q_lane_u32_ptr(__transfersize(4) uint32_t const * ptr, uint32x4x4_t * src, __constrange(0,3) int lane); // VLD4.32 {d0[0], d2[0], d4[0], d6[0]}, [r0]
-int16x8x4_t vld4q_lane_s16_ptr(__transfersize(4) int16_t const * ptr, int16x8x4_t * src, __constrange(0,7) int lane); // VLD4.16 {d0[0], d2[0], d4[0], d6[0]}, [r0]
-int32x4x4_t vld4q_lane_s32_ptr(__transfersize(4) int32_t const * ptr, int32x4x4_t * src, __constrange(0,3) int lane); // VLD4.32 {d0[0], d2[0], d4[0], d6[0]}, [r0]
-float16x8x4_t vld4q_lane_f16_ptr(__transfersize(4) __fp16 const * ptr, float16x8x4_t * src, __constrange(0,7) int lane); // VLD4.16 {d0[0], d2[0], d4[0], d6[0]}, [r0]
-float32x4x4_t vld4q_lane_f32_ptr(__transfersize(4) float32_t const * ptr, float32x4x4_t * src, __constrange(0,3) int lane); // VLD4.32 {d0[0], d2[0], d4[0], d6[0]}, [r0]
-poly16x8x4_t vld4q_lane_p16_ptr(__transfersize(4) poly16_t const * ptr, poly16x8x4_t * src, __constrange(0,7) int lane); // VLD4.16 {d0[0], d2[0], d4[0], d6[0]}, [r0]
-uint8x8x4_t vld4_lane_u8_ptr(__transfersize(4) uint8_t const * ptr, uint8x8x4_t * src, __constrange(0,7) int lane); //VLD4.8 {d0[0], d1[0], d2[0], d3[0]}, [r0]
-uint16x4x4_t vld4_lane_u16_ptr(__transfersize(4) uint16_t const * ptr, uint16x4x4_t * src, __constrange(0,3) int lane); // VLD4.16 {d0[0], d1[0], d2[0], d3[0]}, [r0]
-uint32x2x4_t vld4_lane_u32_ptr(__transfersize(4) uint32_t const * ptr, uint32x2x4_t * src, __constrange(0,1) int lane); // VLD4.32 {d0[0], d1[0], d2[0], d3[0]}, [r0]
-int8x8x4_t vld4_lane_s8_ptr(__transfersize(4) int8_t const * ptr, int8x8x4_t * src, __constrange(0,7) int lane); //VLD4.8 {d0[0], d1[0], d2[0], d3[0]}, [r0]
-int16x4x4_t vld4_lane_s16_ptr(__transfersize(4) int16_t const * ptr, int16x4x4_t * src, __constrange(0,3) int lane); //VLD4.16 {d0[0], d1[0], d2[0], d3[0]}, [r0]
-int32x2x4_t vld4_lane_s32_ptr(__transfersize(4) int32_t const * ptr, int32x2x4_t * src, __constrange(0,1) int lane); //VLD4.32 {d0[0], d1[0], d2[0], d3[0]}, [r0]
-float16x4x4_t vld4_lane_f16_ptr(__transfersize(4) __fp16 const * ptr, float16x4x4_t * src, __constrange(0,3) int lane); // VLD4.16 {d0[0], d1[0], d2[0], d3[0]}, [r0]
-float32x2x4_t vld4_lane_f32_ptr(__transfersize(4) float32_t const * ptr, float32x2x4_t * src, __constrange(0,1) int lane); // VLD4.32 {d0[0], d1[0], d2[0], d3[0]}, [r0]
-poly8x8x4_t vld4_lane_p8_ptr(__transfersize(4) poly8_t const * ptr, poly8x8x4_t * src, __constrange(0,7) int lane); //VLD4.8 {d0[0], d1[0], d2[0], d3[0]}, [r0]
-poly16x4x4_t vld4_lane_p16_ptr(__transfersize(4) poly16_t const * ptr, poly16x4x4_t * src, __constrange(0,3) int lane); // VLD4.16 {d0[0], d1[0], d2[0], d3[0]}, [r0]
-//Store N-element structure to memory
-void vst2q_u8_ptr(__transfersize(32) uint8_t * ptr, uint8x16x2_t * val); // VST2.8 {d0, d2}, [r0]
-void vst2q_u16_ptr(__transfersize(16) uint16_t * ptr, uint16x8x2_t * val); // VST2.16 {d0, d2}, [r0]
-void vst2q_u32_ptr(__transfersize(8) uint32_t * ptr, uint32x4x2_t * val); // VST2.32 {d0, d2}, [r0]
-void vst2q_s8_ptr(__transfersize(32) int8_t * ptr, int8x16x2_t * val); // VST2.8 {d0, d2}, [r0]
-void vst2q_s16_ptr(__transfersize(16) int16_t * ptr, int16x8x2_t * val); // VST2.16 {d0, d2}, [r0]
-void vst2q_s32_ptr(__transfersize(8) int32_t * ptr, int32x4x2_t * val); // VST2.32 {d0, d2}, [r0]
-void vst2q_f16_ptr(__transfersize(16) __fp16 * ptr, float16x8x2_t * val); // VST2.16 {d0, d2}, [r0]
-void vst2q_f32_ptr(__transfersize(8) float32_t * ptr, float32x4x2_t * val); // VST2.32 {d0, d2}, [r0]
-void vst2q_p8_ptr(__transfersize(32) poly8_t * ptr, poly8x16x2_t * val); // VST2.8 {d0, d2}, [r0]
-void vst2q_p16_ptr(__transfersize(16) poly16_t * ptr, poly16x8x2_t * val); // VST2.16 {d0, d2}, [r0]
-void vst2_u8_ptr(__transfersize(16) uint8_t * ptr, uint8x8x2_t * val); // VST2.8 {d0, d1}, [r0]
-void vst2_u16_ptr(__transfersize(8) uint16_t * ptr, uint16x4x2_t * val); // VST2.16 {d0, d1}, [r0]
-void vst2_u32_ptr(__transfersize(4) uint32_t * ptr, uint32x2x2_t * val); // VST2.32 {d0, d1}, [r0]
-void vst2_u64_ptr(__transfersize(2) uint64_t * ptr, uint64x1x2_t * val); // VST1.64 {d0, d1}, [r0]
-void vst2_s8_ptr(__transfersize(16) int8_t * ptr, int8x8x2_t * val); // VST2.8 {d0, d1}, [r0]
-void vst2_s16_ptr(__transfersize(8) int16_t * ptr, int16x4x2_t * val); // VST2.16 {d0, d1}, [r0]
-void vst2_s32_ptr(__transfersize(4) int32_t * ptr, int32x2x2_t * val); // VST2.32 {d0, d1}, [r0]
-void vst2_s64_ptr(__transfersize(2) int64_t * ptr, int64x1x2_t * val); // VST1.64 {d0, d1}, [r0]
-//void vst2_f16_ptr(__transfersize(8) __fp16 * ptr, float16x4x2_t * val); // VST2.16 {d0, d1}, [r0]
-void vst2_f32_ptr(__transfersize(4) float32_t * ptr, float32x2x2_t * val); // VST2.32 {d0, d1}, [r0]
-void vst2_p8_ptr(__transfersize(16) poly8_t * ptr, poly8x8x2_t * val); // VST2.8 {d0, d1}, [r0]
-void vst2_p16_ptr(__transfersize(8) poly16_t * ptr, poly16x4x2_t * val); // VST2.16 {d0, d1}, [r0]
-void vst3q_u8_ptr(__transfersize(48) uint8_t * ptr, uint8x16x3_t * val); // VST3.8 {d0, d2, d4}, [r0]
-void vst3q_u16_ptr(__transfersize(24) uint16_t * ptr, uint16x8x3_t * val); // VST3.16 {d0, d2, d4}, [r0]
-void vst3q_u32_ptr(__transfersize(12) uint32_t * ptr, uint32x4x3_t * val); // VST3.32 {d0, d2, d4}, [r0]
-void vst3q_s8_ptr(__transfersize(48) int8_t * ptr, int8x16x3_t * val); // VST3.8 {d0, d2, d4}, [r0]
-void vst3q_s16_ptr(__transfersize(24) int16_t * ptr, int16x8x3_t * val); // VST3.16 {d0, d2, d4}, [r0]
-void vst3q_s32_ptr(__transfersize(12) int32_t * ptr, int32x4x3_t * val); // VST3.32 {d0, d2, d4}, [r0]
-void vst3q_f16_ptr(__transfersize(24) __fp16 * ptr, float16x8x3_t * val); // VST3.16 {d0, d2, d4}, [r0]
-void vst3q_f32_ptr(__transfersize(12) float32_t * ptr, float32x4x3_t * val); // VST3.32 {d0, d2, d4}, [r0]
-void vst3q_p8_ptr(__transfersize(48) poly8_t * ptr, poly8x16x3_t * val); // VST3.8 {d0, d2, d4}, [r0]
-void vst3q_p16_ptr(__transfersize(24) poly16_t * ptr, poly16x8x3_t * val); // VST3.16 {d0, d2, d4}, [r0]
-void vst3_u8_ptr(__transfersize(24) uint8_t * ptr, uint8x8x3_t * val); // VST3.8 {d0, d1, d2}, [r0]
-void vst3_u16_ptr(__transfersize(12) uint16_t * ptr, uint16x4x3_t * val); // VST3.16 {d0, d1, d2}, [r0]
-void vst3_u32_ptr(__transfersize(6) uint32_t * ptr, uint32x2x3_t * val); // VST3.32 {d0, d1, d2}, [r0]
-void vst3_u64_ptr(__transfersize(3) uint64_t * ptr, uint64x1x3_t * val); // VST1.64 {d0, d1, d2}, [r0]
-void vst3_s8_ptr(__transfersize(24) int8_t * ptr, int8x8x3_t * val); // VST3.8 {d0, d1, d2}, [r0]
-void vst3_s16_ptr(__transfersize(12) int16_t * ptr, int16x4x3_t * val); // VST3.16 {d0, d1, d2}, [r0]
-void vst3_s32_ptr(__transfersize(6) int32_t * ptr, int32x2x3_t * val); // VST3.32 {d0, d1, d2}, [r0]
-void vst3_s64_ptr(__transfersize(3) int64_t * ptr, int64x1x3_t * val); // VST1.64 {d0, d1, d2}, [r0]
-void vst3_f16_ptr(__transfersize(12) __fp16 * ptr, float16x4x3_t * val); // VST3.16 {d0, d1, d2}, [r0]
-void vst3_f32_ptr(__transfersize(6) float32_t * ptr, float32x2x3_t * val); // VST3.32 {d0, d1, d2}, [r0]
-void vst3_p8_ptr(__transfersize(24) poly8_t * ptr, poly8x8x3_t * val); // VST3.8 {d0, d1, d2}, [r0]
-void vst3_p16_ptr(__transfersize(12) poly16_t * ptr, poly16x4x3_t * val); // VST3.16 {d0, d1, d2}, [r0]
-void vst4q_u8_ptr(__transfersize(64) uint8_t * ptr, uint8x16x4_t * val); // VST4.8 {d0, d2, d4, d6}, [r0]
-void vst4q_u16_ptr(__transfersize(32) uint16_t * ptr, uint16x8x4_t * val); // VST4.16 {d0, d2, d4, d6}, [r0]
-void vst4q_u32_ptr(__transfersize(16) uint32_t * ptr, uint32x4x4_t * val); // VST4.32 {d0, d2, d4, d6}, [r0]
-void vst4q_s8_ptr(__transfersize(64) int8_t * ptr, int8x16x4_t * val); // VST4.8 {d0, d2, d4, d6}, [r0]
-void vst4q_s16_ptr(__transfersize(32) int16_t * ptr, int16x8x4_t * val); // VST4.16 {d0, d2, d4, d6}, [r0]
-void vst4q_s32_ptr(__transfersize(16) int32_t * ptr, int32x4x4_t * val); // VST4.32 {d0, d2, d4, d6}, [r0]
-void vst4q_f16_ptr(__transfersize(32) __fp16 * ptr, float16x8x4_t * val); // VST4.16 {d0, d2, d4, d6}, [r0]
-void vst4q_f32_ptr(__transfersize(16) float32_t * ptr, float32x4x4_t * val); // VST4.32 {d0, d2, d4, d6}, [r0]
-void vst4q_p8_ptr(__transfersize(64) poly8_t * ptr, poly8x16x4_t * val); // VST4.8 {d0, d2, d4, d6}, [r0]
-void vst4q_p16_ptr(__transfersize(32) poly16_t * ptr, poly16x8x4_t * val); // VST4.16 {d0, d2, d4, d6}, [r0]
-void vst4_u8_ptr(__transfersize(32) uint8_t * ptr, uint8x8x4_t * val); // VST4.8 {d0, d1, d2, d3}, [r0]
-void vst4_u16_ptr(__transfersize(16) uint16_t * ptr, uint16x4x4_t * val); // VST4.16 {d0, d1, d2, d3}, [r0]
-void vst4_u32_ptr(__transfersize(8) uint32_t * ptr, uint32x2x4_t * val); // VST4.32 {d0, d1, d2, d3}, [r0]
-void vst4_u64_ptr(__transfersize(4) uint64_t * ptr, uint64x1x4_t * val); // VST1.64 {d0, d1, d2, d3}, [r0]
-void vst4_s8_ptr(__transfersize(32) int8_t * ptr, int8x8x4_t * val); // VST4.8 {d0, d1, d2, d3}, [r0]
-void vst4_s16_ptr(__transfersize(16) int16_t * ptr, int16x4x4_t * val); // VST4.16 {d0, d1, d2, d3}, [r0]
-void vst4_s32_ptr(__transfersize(8) int32_t * ptr, int32x2x4_t * val); // VST4.32 {d0, d1, d2, d3}, [r0]
-void vst4_s64_ptr(__transfersize(4) int64_t * ptr, int64x1x4_t * val); // VST1.64 {d0, d1, d2, d3}, [r0]
-void vst4_f16_ptr(__transfersize(16) __fp16 * ptr, float16x4x4_t * val); // VST4.16 {d0, d1, d2, d3}, [r0]
-void vst4_f32_ptr(__transfersize(8) float32_t * ptr, float32x2x4_t * val); // VST4.32 {d0, d1, d2, d3}, [r0]
-void vst4_p8_ptr(__transfersize(32) poly8_t * ptr, poly8x8x4_t * val); // VST4.8 {d0, d1, d2, d3}, [r0]
-void vst4_p16_ptr(__transfersize(16) poly16_t * ptr, poly16x4x4_t * val); // VST4.16 {d0, d1, d2, d3}, [r0]
-//Store a single lane of N-element structure to memory
-void vst2q_lane_u16_ptr(__transfersize(2) uint16_t * ptr, uint16x8x2_t * val, __constrange(0,7) int lane); // VST2.16{d0[0], d2[0]}, [r0]
-void vst2q_lane_u32_ptr(__transfersize(2) uint32_t * ptr, uint32x4x2_t * val, __constrange(0,3) int lane); // VST2.32{d0[0], d2[0]}, [r0]
-void vst2q_lane_s16_ptr(__transfersize(2) int16_t * ptr, int16x8x2_t * val, __constrange(0,7) int lane); // VST2.16{d0[0], d2[0]}, [r0]
-void vst2q_lane_s32_ptr(__transfersize(2) int32_t * ptr, int32x4x2_t * val, __constrange(0,3) int lane); // VST2.32{d0[0], d2[0]}, [r0]
-void vst2q_lane_f16_ptr(__transfersize(2) __fp16 * ptr, float16x8x2_t * val, __constrange(0,7) int lane); // VST2.16{d0[0], d2[0]}, [r0]
-void vst2q_lane_f32_ptr(__transfersize(2) float32_t * ptr, float32x4x2_t * val, __constrange(0,3) int lane); //VST2.32 {d0[0], d2[0]}, [r0]
-void vst2q_lane_p16_ptr(__transfersize(2) poly16_t * ptr, poly16x8x2_t * val, __constrange(0,7) int lane); // VST2.16{d0[0], d2[0]}, [r0]
-void vst2_lane_u8_ptr(__transfersize(2) uint8_t * ptr, uint8x8x2_t * val, __constrange(0,7) int lane); // VST2.8{d0[0], d1[0]}, [r0]
-void vst2_lane_u16_ptr(__transfersize(2) uint16_t * ptr, uint16x4x2_t * val, __constrange(0,3) int lane); // VST2.16{d0[0], d1[0]}, [r0]
-void vst2_lane_u32_ptr(__transfersize(2) uint32_t * ptr, uint32x2x2_t * val, __constrange(0,1) int lane); // VST2.32{d0[0], d1[0]}, [r0]
-void vst2_lane_s8_ptr(__transfersize(2) int8_t * ptr, int8x8x2_t * val, __constrange(0,7) int lane); // VST2.8 {d0[0],d1[0]}, [r0]
-void vst2_lane_s16_ptr(__transfersize(2) int16_t * ptr, int16x4x2_t * val, __constrange(0,3) int lane); // VST2.16{d0[0], d1[0]}, [r0]
-void vst2_lane_s32_ptr(__transfersize(2) int32_t * ptr, int32x2x2_t * val, __constrange(0,1) int lane); // VST2.32{d0[0], d1[0]}, [r0]
-void vst2_lane_f16_ptr(__transfersize(2) __fp16 * ptr, float16x4x2_t * val, __constrange(0,3) int lane); // VST2.16{d0[0], d1[0]}, [r0]
-void vst2_lane_f32_ptr(__transfersize(2) float32_t * ptr, float32x2x2_t * val, __constrange(0,1) int lane); // VST2.32{d0[0], d1[0]}, [r0]
-void vst2_lane_p8_ptr(__transfersize(2) poly8_t * ptr, poly8x8x2_t * val, __constrange(0,7) int lane); // VST2.8{d0[0], d1[0]}, [r0]
-void vst2_lane_p16_ptr(__transfersize(2) poly16_t * ptr, poly16x4x2_t * val, __constrange(0,3) int lane); // VST2.16{d0[0], d1[0]}, [r0]
-void vst3q_lane_u16_ptr(__transfersize(3) uint16_t * ptr, uint16x8x3_t * val, __constrange(0,7) int lane); // VST3.16{d0[0], d2[0], d4[0]}, [r0]
-void vst3q_lane_u32_ptr(__transfersize(3) uint32_t * ptr, uint32x4x3_t * val, __constrange(0,3) int lane); // VST3.32{d0[0], d2[0], d4[0]}, [r0]
-void vst3q_lane_s16_ptr(__transfersize(3) int16_t * ptr, int16x8x3_t * val, __constrange(0,7) int lane); // VST3.16{d0[0], d2[0], d4[0]}, [r0]
-void vst3q_lane_s32_ptr(__transfersize(3) int32_t * ptr, int32x4x3_t * val, __constrange(0,3) int lane); // VST3.32{d0[0], d2[0], d4[0]}, [r0]
-void vst3q_lane_f16_ptr(__transfersize(3) __fp16 * ptr, float16x8x3_t * val, __constrange(0,7) int lane); // VST3.16{d0[0], d2[0], d4[0]}, [r0]
-void vst3q_lane_f32_ptr(__transfersize(3) float32_t * ptr, float32x4x3_t * val, __constrange(0,3) int lane); //VST3.32 {d0[0], d2[0], d4[0]}, [r0]
-void vst3q_lane_p16_ptr(__transfersize(3) poly16_t * ptr, poly16x8x3_t * val, __constrange(0,7) int lane); // VST3.16{d0[0], d2[0], d4[0]}, [r0]
-void vst3_lane_u8_ptr(__transfersize(3) uint8_t * ptr, uint8x8x3_t * val, __constrange(0,7) int lane); // VST3.8{d0[0], d1[0], d2[0]}, [r0]
-void vst3_lane_u16_ptr(__transfersize(3) uint16_t * ptr, uint16x4x3_t * val, __constrange(0,3) int lane); // VST3.16{d0[0], d1[0], d2[0]}, [r0]
-void vst3_lane_u32_ptr(__transfersize(3) uint32_t * ptr, uint32x2x3_t * val, __constrange(0,1) int lane); // VST3.32{d0[0], d1[0], d2[0]}, [r0]
-void vst3_lane_s8_ptr(__transfersize(3) int8_t * ptr, int8x8x3_t * val, __constrange(0,7) int lane); // VST3.8 {d0[0],d1[0], d2[0]}, [r0]
-void vst3_lane_s16_ptr(__transfersize(3) int16_t * ptr, int16x4x3_t * val, __constrange(0,3) int lane); // VST3.16{d0[0], d1[0], d2[0]}, [r0]
-void vst3_lane_s32_ptr(__transfersize(3) int32_t * ptr, int32x2x3_t * val, __constrange(0,1) int lane); // VST3.32{d0[0], d1[0], d2[0]}, [r0]
-void vst3_lane_f16_ptr(__transfersize(3) __fp16 * ptr, float16x4x3_t * val, __constrange(0,3) int lane); // VST3.16{d0[0], d1[0], d2[0]}, [r0]
-void vst3_lane_f32_ptr(__transfersize(3) float32_t * ptr, float32x2x3_t * val, __constrange(0,1) int lane); // VST3.32{d0[0], d1[0], d2[0]}, [r0]
-void vst3_lane_p8_ptr(__transfersize(3) poly8_t * ptr, poly8x8x3_t * val, __constrange(0,7) int lane); // VST3.8{d0[0], d1[0], d2[0]}, [r0]
-void vst3_lane_p16_ptr(__transfersize(3) poly16_t * ptr, poly16x4x3_t * val, __constrange(0,3) int lane); // VST3.16{d0[0], d1[0], d2[0]}, [r0]
-void vst4q_lane_u16_ptr(__transfersize(4) uint16_t * ptr, uint16x8x4_t * val, __constrange(0,7) int lane); // VST4.16{d0[0], d2[0], d4[0], d6[0]}, [r0]
-void vst4q_lane_u32_ptr(__transfersize(4) uint32_t * ptr, uint32x4x4_t * val, __constrange(0,3) int lane); // VST4.32{d0[0], d2[0], d4[0], d6[0]}, [r0]
-void vst4q_lane_s16_ptr(__transfersize(4) int16_t * ptr, int16x8x4_t * val, __constrange(0,7) int lane); // VST4.16{d0[0], d2[0], d4[0], d6[0]}, [r0]
-void vst4q_lane_s32_ptr(__transfersize(4) int32_t * ptr, int32x4x4_t * val, __constrange(0,3) int lane); // VST4.32{d0[0], d2[0], d4[0], d6[0]}, [r0]
-void vst4q_lane_f16_ptr(__transfersize(4) __fp16 * ptr, float16x8x4_t * val, __constrange(0,7) int lane); // VST4.16{d0[0], d2[0], d4[0], d6[0]}, [r0]
-void vst4q_lane_f32_ptr(__transfersize(4) float32_t * ptr, float32x4x4_t * val, __constrange(0,3) int lane); //VST4.32 {d0[0], d2[0], d4[0], d6[0]}, [r0]
-void vst4q_lane_p16_ptr(__transfersize(4) poly16_t * ptr, poly16x8x4_t * val, __constrange(0,7) int lane); // VST4.16{d0[0], d2[0], d4[0], d6[0]}, [r0]
-void vst4_lane_u8_ptr(__transfersize(4) uint8_t * ptr, uint8x8x4_t * val, __constrange(0,7) int lane); // VST4.8{d0[0], d1[0], d2[0], d3[0]}, [r0]
-void vst4_lane_u16_ptr(__transfersize(4) uint16_t * ptr, uint16x4x4_t * val, __constrange(0,3) int lane); // VST4.16{d0[0], d1[0], d2[0], d3[0]}, [r0]
-void vst4_lane_u32_ptr(__transfersize(4) uint32_t * ptr, uint32x2x4_t * val, __constrange(0,1) int lane); // VST4.32{d0[0], d1[0], d2[0], d3[0]}, [r0]
-void vst4_lane_s8_ptr(__transfersize(4) int8_t * ptr, int8x8x4_t * val, __constrange(0,7) int lane); // VST4.8 {d0[0],d1[0], d2[0], d3[0]}, [r0]
-void vst4_lane_s16_ptr(__transfersize(4) int16_t * ptr, int16x4x4_t * val, __constrange(0,3) int lane); // VST4.16{d0[0], d1[0], d2[0], d3[0]}, [r0]
-void vst4_lane_s32_ptr(__transfersize(4) int32_t * ptr, int32x2x4_t * val, __constrange(0,1) int lane); // VST4.32{d0[0], d1[0], d2[0], d3[0]}, [r0]
-void vst4_lane_f16_ptr(__transfersize(4) __fp16 * ptr, float16x4x4_t * val, __constrange(0,3) int lane); // VST4.16{d0[0], d1[0], d2[0], d3[0]}, [r0]
-void vst4_lane_f32_ptr(__transfersize(4) float32_t * ptr, float32x2x4_t * val, __constrange(0,1) int lane); // VST4.32{d0[0], d1[0], d2[0], d3[0]}, [r0]
-void vst4_lane_p8_ptr(__transfersize(4) poly8_t * ptr, poly8x8x4_t * val, __constrange(0,7) int lane); // VST4.8{d0[0], d1[0], d2[0], d3[0]}, [r0]
-void vst4_lane_p16_ptr(__transfersize(4) poly16_t * ptr, poly16x4x4_t * val, __constrange(0,3) int lane); // VST4.16{d0[0], d1[0], d2[0], d3[0]}, [r0]
-//Extract lanes from a vector and put into a register. These intrinsics extract a single lane (element) from a vector.
-uint8_t vget_lane_u8(uint8x8_t vec, __constrange(0,7) int lane); // VMOV.U8 r0, d0[0]
-uint16_t vget_lane_u16(uint16x4_t vec, __constrange(0,3) int lane); // VMOV.U16 r0, d0[0]
-uint32_t vget_lane_u32(uint32x2_t vec, __constrange(0,1) int lane); // VMOV.32 r0, d0[0]
-int8_t vget_lane_s8(int8x8_t vec, __constrange(0,7) int lane); // VMOV.S8 r0, d0[0]
-int16_t vget_lane_s16(int16x4_t vec, __constrange(0,3) int lane); // VMOV.S16 r0, d0[0]
-int32_t vget_lane_s32(int32x2_t vec, __constrange(0,1) int lane); // VMOV.32 r0, d0[0]
-poly8_t vget_lane_p8(poly8x8_t vec, __constrange(0,7) int lane); // VMOV.U8 r0, d0[0]
-poly16_t vget_lane_p16(poly16x4_t vec, __constrange(0,3) int lane); // VMOV.U16 r0, d0[0]
-float32_t vget_lane_f32(float32x2_t vec, __constrange(0,1) int lane); // VMOV.32 r0, d0[0]
-uint8_t vgetq_lane_u8(uint8x16_t vec, __constrange(0,15) int lane); // VMOV.U8 r0, d0[0]
-uint16_t vgetq_lane_u16(uint16x8_t vec, __constrange(0,7) int lane); // VMOV.U16 r0, d0[0]
-uint32_t vgetq_lane_u32(uint32x4_t vec, __constrange(0,3) int lane); // VMOV.32 r0, d0[0]
-int8_t vgetq_lane_s8(int8x16_t vec, __constrange(0,15) int lane); // VMOV.S8 r0, d0[0]
-int16_t vgetq_lane_s16(int16x8_t vec, __constrange(0,7) int lane); // VMOV.S16 r0, d0[0]
-int32_t vgetq_lane_s32(int32x4_t vec, __constrange(0,3) int lane); // VMOV.32 r0, d0[0]
-poly8_t vgetq_lane_p8(poly8x16_t vec, __constrange(0,15) int lane); // VMOV.U8 r0, d0[0]
-poly16_t vgetq_lane_p16(poly16x8_t vec, __constrange(0,7) int lane); // VMOV.U16 r0, d0[0]
-float32_t vgetq_lane_f32(float32x4_t vec, __constrange(0,3) int lane); // VMOV.32 r0, d0[0]
-int64_t vget_lane_s64(int64x1_t vec, __constrange(0,0) int lane); // VMOV r0,r0,d0
-uint64_t vget_lane_u64(uint64x1_t vec, __constrange(0,0) int lane); // VMOV r0,r0,d0
-int64_t vgetq_lane_s64(int64x2_t vec, __constrange(0,1) int lane); // VMOV r0,r0,d0
-uint64_t vgetq_lane_u64(uint64x2_t vec, __constrange(0,1) int lane); // VMOV r0,r0,d0
-//Load a single lane of a vector from a literal. These intrinsics set a single lane (element) within a vector.
-uint8x8_t vset_lane_u8(uint8_t value, uint8x8_t vec, __constrange(0,7) int lane); // VMOV.8 d0[0],r0
-uint16x4_t vset_lane_u16(uint16_t value, uint16x4_t vec, __constrange(0,3) int lane); // VMOV.16 d0[0],r0
-uint32x2_t vset_lane_u32(uint32_t value, uint32x2_t vec, __constrange(0,1) int lane); // VMOV.32 d0[0],r0
-int8x8_t vset_lane_s8(int8_t value, int8x8_t vec, __constrange(0,7) int lane); // VMOV.8 d0[0],r0
-int16x4_t vset_lane_s16(int16_t value, int16x4_t vec, __constrange(0,3) int lane); // VMOV.16 d0[0],r0
-int32x2_t vset_lane_s32(int32_t value, int32x2_t vec, __constrange(0,1) int lane); // VMOV.32 d0[0],r0
-poly8x8_t vset_lane_p8(poly8_t value, poly8x8_t vec, __constrange(0,7) int lane); // VMOV.8 d0[0],r0
-poly16x4_t vset_lane_p16(poly16_t value, poly16x4_t vec, __constrange(0,3) int lane); // VMOV.16 d0[0],r0
-float32x2_t vset_lane_f32(float32_t value, float32x2_t vec, __constrange(0,1) int lane); // VMOV.32 d0[0],r0
-uint8x16_t vsetq_lane_u8(uint8_t value, uint8x16_t vec, __constrange(0,15) int lane); // VMOV.8 d0[0],r0
-uint16x8_t vsetq_lane_u16(uint16_t value, uint16x8_t vec, __constrange(0,7) int lane); // VMOV.16 d0[0],r0
-uint32x4_t vsetq_lane_u32(uint32_t value, uint32x4_t vec, __constrange(0,3) int lane); // VMOV.32 d0[0],r0
-int8x16_t vsetq_lane_s8(int8_t value, int8x16_t vec, __constrange(0,15) int lane); // VMOV.8 d0[0],r0
-int16x8_t vsetq_lane_s16(int16_t value, int16x8_t vec, __constrange(0,7) int lane); // VMOV.16 d0[0],r0
-int32x4_t vsetq_lane_s32(int32_t value, int32x4_t vec, __constrange(0,3) int lane); // VMOV.32 d0[0],r0
-poly8x16_t vsetq_lane_p8(poly8_t value, poly8x16_t vec, __constrange(0,15) int lane); // VMOV.8 d0[0],r0
-poly16x8_t vsetq_lane_p16(poly16_t value, poly16x8_t vec, __constrange(0,7) int lane); // VMOV.16 d0[0],r0
-float32x4_t vsetq_lane_f32(float32_t value, float32x4_t vec, __constrange(0,3) int lane); // VMOV.32 d0[0],r0
-int64x1_t vset_lane_s64(int64_t value, int64x1_t vec, __constrange(0,0) int lane); // VMOV d0,r0,r0
-uint64x1_t vset_lane_u64(uint64_t value, uint64x1_t vec, __constrange(0,0) int lane); // VMOV d0,r0,r0
-int64x2_t vsetq_lane_s64(int64_t value, int64x2_t vec, __constrange(0,1) int lane); // VMOV d0,r0,r0
-uint64x2_t vsetq_lane_u64(uint64_t value, uint64x2_t vec, __constrange(0,1) int lane); // VMOV d0,r0,r0
-//Initialize a vector from a literal bit pattern.
-int8x8_t vcreate_s8(uint64_t a); // VMOV d0,r0,r0
-int16x4_t vcreate_s16(uint64_t a); // VMOV d0,r0,r0
-int32x2_t vcreate_s32(uint64_t a); // VMOV d0,r0,r0
-float16x4_t vcreate_f16(uint64_t a); // VMOV d0,r0,r0
-float32x2_t vcreate_f32(uint64_t a); // VMOV d0,r0,r0
-uint8x8_t vcreate_u8(uint64_t a); // VMOV d0,r0,r0
-uint16x4_t vcreate_u16(uint64_t a); // VMOV d0,r0,r0
-uint32x2_t vcreate_u32(uint64_t a); // VMOV d0,r0,r0
-uint64x1_t vcreate_u64(uint64_t a); // VMOV d0,r0,r0
-poly8x8_t vcreate_p8(uint64_t a); // VMOV d0,r0,r0
-poly16x4_t vcreate_p16(uint64_t a); // VMOV d0,r0,r0
-int64x1_t vcreate_s64(uint64_t a); // VMOV d0,r0,r0
-//Set all lanes to same value
-//Load all lanes of vector to the same literal value
-uint8x8_t vdup_n_u8(uint8_t value); // VDUP.8 d0,r0
-uint16x4_t vdup_n_u16(uint16_t value); // VDUP.16 d0,r0
-uint32x2_t vdup_n_u32(uint32_t value); // VDUP.32 d0,r0
-int8x8_t vdup_n_s8(int8_t value); // VDUP.8 d0,r0
-int16x4_t vdup_n_s16(int16_t value); // VDUP.16 d0,r0
-int32x2_t vdup_n_s32(int32_t value); // VDUP.32 d0,r0
-poly8x8_t vdup_n_p8(poly8_t value); // VDUP.8 d0,r0
-poly16x4_t vdup_n_p16(poly16_t value); // VDUP.16 d0,r0
-float32x2_t vdup_n_f32(float32_t value); // VDUP.32 d0,r0
-uint8x16_t vdupq_n_u8(uint8_t value); // VDUP.8 q0,r0
-uint16x8_t vdupq_n_u16(uint16_t value); // VDUP.16 q0,r0
-uint32x4_t vdupq_n_u32(uint32_t value); // VDUP.32 q0,r0
-int8x16_t vdupq_n_s8(int8_t value); // VDUP.8 q0,r0
-int16x8_t vdupq_n_s16(int16_t value); // VDUP.16 q0,r0
-int32x4_t vdupq_n_s32(int32_t value); //
VDUP.32 q0,r0 -poly8x16_t vdupq_n_p8(poly8_t value); // VDUP.8 q0,r0 -poly16x8_t vdupq_n_p16(poly16_t value); // VDUP.16 q0,r0 -float32x4_t vdupq_n_f32(float32_t value); // VDUP.32 q0,r0 -int64x1_t vdup_n_s64(int64_t value); // VMOV d0,r0,r0 -uint64x1_t vdup_n_u64(uint64_t value); // VMOV d0,r0,r0 -int64x2_t vdupq_n_s64(int64_t value); // VMOV d0,r0,r0 -uint64x2_t vdupq_n_u64(uint64_t value); // VMOV d0,r0,r0 -uint8x8_t vmov_n_u8(uint8_t value); // VDUP.8 d0,r0 -uint16x4_t vmov_n_u16(uint16_t value); // VDUP.16 d0,r0 -uint32x2_t vmov_n_u32(uint32_t value); // VDUP.32 d0,r0 -int8x8_t vmov_n_s8(int8_t value); // VDUP.8 d0,r0 -int16x4_t vmov_n_s16(int16_t value); // VDUP.16 d0,r0 -int32x2_t vmov_n_s32(int32_t value); // VDUP.32 d0,r0 -poly8x8_t vmov_n_p8(poly8_t value); // VDUP.8 d0,r0 -poly16x4_t vmov_n_p16(poly16_t value); // VDUP.16 d0,r0 -float32x2_t vmov_n_f32(float32_t value); // VDUP.32 d0,r0 -uint8x16_t vmovq_n_u8(uint8_t value); // VDUP.8 q0,r0 -uint16x8_t vmovq_n_u16(uint16_t value); // VDUP.16 q0,r0 -uint32x4_t vmovq_n_u32(uint32_t value); // VDUP.32 q0,r0 -int8x16_t vmovq_n_s8(int8_t value); // VDUP.8 q0,r0 -int16x8_t vmovq_n_s16(int16_t value); // VDUP.16 q0,r0 -int32x4_t vmovq_n_s32(int32_t value); // VDUP.32 q0,r0 -poly8x16_t vmovq_n_p8(poly8_t value); // VDUP.8 q0,r0 -poly16x8_t vmovq_n_p16(poly16_t value); // VDUP.16 q0,r0 -float32x4_t vmovq_n_f32(float32_t value); // VDUP.32 q0,r0 -int64x1_t vmov_n_s64(int64_t value); // VMOV d0,r0,r0 -uint64x1_t vmov_n_u64(uint64_t value); // VMOV d0,r0,r0 -int64x2_t vmovq_n_s64(int64_t value); // VMOV d0,r0,r0 -uint64x2_t vmovq_n_u64(uint64_t value); // VMOV d0,r0,r0 -//Load all lanes of the vector to the value of a lane of a vector -uint8x8_t vdup_lane_u8(uint8x8_t vec, __constrange(0,7) int lane); // VDUP.8 d0,d0[0] -uint16x4_t vdup_lane_u16(uint16x4_t vec, __constrange(0,3) int lane); // VDUP.16 d0,d0[0] -uint32x2_t vdup_lane_u32(uint32x2_t vec, __constrange(0,1) int lane); // VDUP.32 d0,d0[0] -int8x8_t 
vdup_lane_s8(int8x8_t vec, __constrange(0,7) int lane); // VDUP.8 d0,d0[0] -int16x4_t vdup_lane_s16(int16x4_t vec, __constrange(0,3) int lane); // VDUP.16 d0,d0[0] -int32x2_t vdup_lane_s32(int32x2_t vec, __constrange(0,1) int lane); // VDUP.32 d0,d0[0] -poly8x8_t vdup_lane_p8(poly8x8_t vec, __constrange(0,7) int lane); // VDUP.8 d0,d0[0] -poly16x4_t vdup_lane_p16(poly16x4_t vec, __constrange(0,3) int lane); // VDUP.16 d0,d0[0] -float32x2_t vdup_lane_f32(float32x2_t vec, __constrange(0,1) int lane); // VDUP.32 d0,d0[0] -uint8x16_t vdupq_lane_u8(uint8x8_t vec, __constrange(0,7) int lane); // VDUP.8 q0,d0[0] -uint16x8_t vdupq_lane_u16(uint16x4_t vec, __constrange(0,3) int lane); // VDUP.16 q0,d0[0] -uint32x4_t vdupq_lane_u32(uint32x2_t vec, __constrange(0,1) int lane); // VDUP.32 q0,d0[0] -int8x16_t vdupq_lane_s8(int8x8_t vec, __constrange(0,7) int lane); // VDUP.8 q0,d0[0] -int16x8_t vdupq_lane_s16(int16x4_t vec, __constrange(0,3) int lane); // VDUP.16 q0,d0[0] -int32x4_t vdupq_lane_s32(int32x2_t vec, __constrange(0,1) int lane); // VDUP.32 q0,d0[0] -poly8x16_t vdupq_lane_p8(poly8x8_t vec, __constrange(0,7) int lane); // VDUP.8 q0,d0[0] -poly16x8_t vdupq_lane_p16(poly16x4_t vec, __constrange(0,3) int lane); // VDUP.16 q0,d0[0] -float32x4_t vdupq_lane_f32(float32x2_t vec, __constrange(0,1) int lane); // VDUP.32 q0,d0[0] -int64x1_t vdup_lane_s64(int64x1_t vec, __constrange(0,0) int lane); // VMOV d0,d0 -uint64x1_t vdup_lane_u64(uint64x1_t vec, __constrange(0,0) int lane); // VMOV d0,d0 -int64x2_t vdupq_lane_s64(int64x1_t vec, __constrange(0,0) int lane); // VMOV q0,q0 -uint64x2_t vdupq_lane_u64(uint64x1_t vec, __constrange(0,0) int lane); // VMOV q0,q0 -//Combining vectors. These intrinsics join two 64 bit vectors into a single 128bit vector. 
-int8x16_t vcombine_s8(int8x8_t low, int8x8_t high); // VMOV d0,d0 -int16x8_t vcombine_s16(int16x4_t low, int16x4_t high); // VMOV d0,d0 -int32x4_t vcombine_s32(int32x2_t low, int32x2_t high); // VMOV d0,d0 -int64x2_t vcombine_s64(int64x1_t low, int64x1_t high); // VMOV d0,d0 -float16x8_t vcombine_f16(float16x4_t low, float16x4_t high); // VMOV d0,d0 -float32x4_t vcombine_f32(float32x2_t low, float32x2_t high); // VMOV d0,d0 -uint8x16_t vcombine_u8(uint8x8_t low, uint8x8_t high); // VMOV d0,d0 -uint16x8_t vcombine_u16(uint16x4_t low, uint16x4_t high); // VMOV d0,d0 -uint32x4_t vcombine_u32(uint32x2_t low, uint32x2_t high); // VMOV d0,d0 -uint64x2_t vcombine_u64(uint64x1_t low, uint64x1_t high); // VMOV d0,d0 -poly8x16_t vcombine_p8(poly8x8_t low, poly8x8_t high); // VMOV d0,d0 -poly16x8_t vcombine_p16(poly16x4_t low, poly16x4_t high); // VMOV d0,d0 -//Splitting vectors. These intrinsics split a 128 bit vector into 2 component 64 bit vectors -int8x8_t vget_high_s8(int8x16_t a); // VMOV d0,d0 -int16x4_t vget_high_s16(int16x8_t a); // VMOV d0,d0 -int32x2_t vget_high_s32(int32x4_t a); // VMOV d0,d0 -int64x1_t vget_high_s64(int64x2_t a); // VMOV d0,d0 -float16x4_t vget_high_f16(float16x8_t a); // VMOV d0,d0 -float32x2_t vget_high_f32(float32x4_t a); // VMOV d0,d0 -uint8x8_t vget_high_u8(uint8x16_t a); // VMOV d0,d0 -uint16x4_t vget_high_u16(uint16x8_t a); // VMOV d0,d0 -uint32x2_t vget_high_u32(uint32x4_t a); // VMOV d0,d0 -uint64x1_t vget_high_u64(uint64x2_t a); // VMOV d0,d0 -poly8x8_t vget_high_p8(poly8x16_t a); // VMOV d0,d0 -poly16x4_t vget_high_p16(poly16x8_t a); // VMOV d0,d0 -int8x8_t vget_low_s8(int8x16_t a); // VMOV d0,d0 -int16x4_t vget_low_s16(int16x8_t a); // VMOV d0,d0 -int32x2_t vget_low_s32(int32x4_t a); // VMOV d0,d0 -int64x1_t vget_low_s64(int64x2_t a); // VMOV d0,d0 -float16x4_t vget_low_f16(float16x8_t a); // VMOV d0,d0 -float32x2_t vget_low_f32(float32x4_t a); // VMOV d0,d0 -uint8x8_t vget_low_u8(uint8x16_t a); // VMOV d0,d0 -uint16x4_t 
vget_low_u16(uint16x8_t a); // VMOV d0,d0 -uint32x2_t vget_low_u32(uint32x4_t a); // VMOV d0,d0 -uint64x1_t vget_low_u64(uint64x2_t a); // VMOV d0,d0 -poly8x8_t vget_low_p8(poly8x16_t a); // VMOV d0,d0 -poly16x4_t vget_low_p16(poly16x8_t a); // VMOV d0,d0 -//Converting vectors. These intrinsics are used to convert vectors. -//Convert from float -int32x2_t vcvt_s32_f32(float32x2_t a); // VCVT.S32.F32 d0, d0 -uint32x2_t vcvt_u32_f32(float32x2_t a); // VCVT.U32.F32 d0, d0 -int32x4_t vcvtq_s32_f32(float32x4_t a); // VCVT.S32.F32 q0, q0 -uint32x4_t vcvtq_u32_f32(float32x4_t a); // VCVT.U32.F32 q0, q0 -int32x2_t vcvt_n_s32_f32(float32x2_t a, __constrange(1,32) int b); // VCVT.S32.F32 d0, d0, #32 -uint32x2_t vcvt_n_u32_f32(float32x2_t a, __constrange(1,32) int b); // VCVT.U32.F32 d0, d0, #32 -int32x4_t vcvtq_n_s32_f32(float32x4_t a, __constrange(1,32) int b); // VCVT.S32.F32 q0, q0, #32 -uint32x4_t vcvtq_n_u32_f32(float32x4_t a, __constrange(1,32) int b); // VCVT.U32.F32 q0, q0, #32 -//Convert to float -float32x2_t vcvt_f32_s32(int32x2_t a); // VCVT.F32.S32 d0, d0 -float32x2_t vcvt_f32_u32(uint32x2_t a); // VCVT.F32.U32 d0, d0 -float32x4_t vcvtq_f32_s32(int32x4_t a); // VCVT.F32.S32 q0, q0 -float32x4_t vcvtq_f32_u32(uint32x4_t a); // VCVT.F32.U32 q0, q0 -float32x2_t vcvt_n_f32_s32(int32x2_t a, __constrange(1,32) int b); // VCVT.F32.S32 d0, d0, #32 -float32x2_t vcvt_n_f32_u32(uint32x2_t a, __constrange(1,32) int b); // VCVT.F32.U32 d0, d0, #32 -float32x4_t vcvtq_n_f32_s32(int32x4_t a, __constrange(1,32) int b); // VCVT.F32.S32 q0, q0, #32 -float32x4_t vcvtq_n_f32_u32(uint32x4_t a, __constrange(1,32) int b); // VCVT.F32.U32 q0, q0, #32 -//Convert between floats -float16x4_t vcvt_f16_f32(float32x4_t a); // VCVT.F16.F32 d0, q0 -float32x4_t vcvt_f32_f16(float16x4_t a); // VCVT.F32.F16 q0, d0 -//Vector narrow integer -int8x8_t vmovn_s16(int16x8_t a); // VMOVN.I16 d0,q0 -int16x4_t vmovn_s32(int32x4_t a); // VMOVN.I32 d0,q0 -int32x2_t vmovn_s64(int64x2_t a); // VMOVN.I64 d0,q0 
-uint8x8_t vmovn_u16(uint16x8_t a); // VMOVN.I16 d0,q0 -uint16x4_t vmovn_u32(uint32x4_t a); // VMOVN.I32 d0,q0 -uint32x2_t vmovn_u64(uint64x2_t a); // VMOVN.I64 d0,q0 -//Vector long move -int16x8_t vmovl_s8(int8x8_t a); // VMOVL.S8 q0,d0 -int32x4_t vmovl_s16(int16x4_t a); // VMOVL.S16 q0,d0 -int64x2_t vmovl_s32(int32x2_t a); // VMOVL.S32 q0,d0 -uint16x8_t vmovl_u8(uint8x8_t a); // VMOVL.U8 q0,d0 -uint32x4_t vmovl_u16(uint16x4_t a); // VMOVL.U16 q0,d0 -uint64x2_t vmovl_u32(uint32x2_t a); // VMOVL.U32 q0,d0 -//Vector saturating narrow integer -int8x8_t vqmovn_s16(int16x8_t a); // VQMOVN.S16 d0,q0 -int16x4_t vqmovn_s32(int32x4_t a); // VQMOVN.S32 d0,q0 -int32x2_t vqmovn_s64(int64x2_t a); // VQMOVN.S64 d0,q0 -uint8x8_t vqmovn_u16(uint16x8_t a); // VQMOVN.U16 d0,q0 -uint16x4_t vqmovn_u32(uint32x4_t a); // VQMOVN.U32 d0,q0 -uint32x2_t vqmovn_u64(uint64x2_t a); // VQMOVN.U64 d0,q0 -//Vector saturating narrow integer signed->unsigned -uint8x8_t vqmovun_s16(int16x8_t a); // VQMOVUN.S16 d0,q0 -uint16x4_t vqmovun_s32(int32x4_t a); // VQMOVUN.S32 d0,q0 -uint32x2_t vqmovun_s64(int64x2_t a); // VQMOVUN.S64 d0,q0 -//Table look up -uint8x8_t vtbl1_u8(uint8x8_t a, uint8x8_t b); // VTBL.8 d0, {d0}, d0 -int8x8_t vtbl1_s8(int8x8_t a, int8x8_t b); // VTBL.8 d0, {d0}, d0 -poly8x8_t vtbl1_p8(poly8x8_t a, uint8x8_t b); // VTBL.8 d0, {d0}, d0 -uint8x8_t vtbl2_u8_ptr(uint8x8x2_t *a, uint8x8_t b); // VTBL.8 d0, {d0, d1}, d0 -int8x8_t vtbl2_s8_ptr(int8x8x2_t *a, int8x8_t b); // VTBL.8 d0, {d0, d1}, d0 -poly8x8_t vtbl2_p8_ptr(poly8x8x2_t *a, uint8x8_t b); // VTBL.8 d0, {d0, d1}, d0 -uint8x8_t vtbl3_u8_ptr(uint8x8x3_t *a, uint8x8_t b); // VTBL.8 d0, {d0, d1, d2}, d0 -int8x8_t vtbl3_s8_ptr(int8x8x3_t *a, int8x8_t b); // VTBL.8 d0, {d0, d1, d2}, d0 -poly8x8_t vtbl3_p8_ptr(poly8x8x3_t *a, uint8x8_t b); // VTBL.8 d0, {d0, d1, d2}, d0 -uint8x8_t vtbl4_u8_ptr(uint8x8x4_t *a, uint8x8_t b); // VTBL.8 d0, {d0, d1, d2, d3}, d0 -int8x8_t vtbl4_s8_ptr(int8x8x4_t *a, int8x8_t b); // VTBL.8 d0, {d0, d1, d2, 
d3}, d0 -poly8x8_t vtbl4_p8_ptr(poly8x8x4_t *a, uint8x8_t b); // VTBL.8 d0, {d0, d1, d2, d3}, d0 -//Extended table look up intrinsics -uint8x8_t vtbx1_u8(uint8x8_t a, uint8x8_t b, uint8x8_t c); // VTBX.8 d0, {d0}, d0 -int8x8_t vtbx1_s8(int8x8_t a, int8x8_t b, int8x8_t c); // VTBX.8 d0, {d0}, d0 -poly8x8_t vtbx1_p8(poly8x8_t a, poly8x8_t b, uint8x8_t c); // VTBX.8 d0, {d0}, d0 -uint8x8_t vtbx2_u8_ptr(uint8x8_t a, uint8x8x2_t *b, uint8x8_t c); // VTBX.8 d0, {d0, d1}, d0 -int8x8_t vtbx2_s8_ptr(int8x8_t a, int8x8x2_t *b, int8x8_t c); // VTBX.8 d0, {d0, d1}, d0 -poly8x8_t vtbx2_p8_ptr(poly8x8_t a, poly8x8x2_t *b, uint8x8_t c); // VTBX.8 d0, {d0, d1}, d0 -uint8x8_t vtbx3_u8_ptr(uint8x8_t a, uint8x8x3_t *b, uint8x8_t c); // VTBX.8 d0, {d0, d1, d2}, d0 -int8x8_t vtbx3_s8_ptr(int8x8_t a, int8x8x3_t *b, int8x8_t c); // VTBX.8 d0, {d0, d1, d2}, d0 -poly8x8_t vtbx3_p8_ptr(poly8x8_t a, poly8x8x3_t *b, uint8x8_t c); // VTBX.8 d0, {d0, d1, d2}, d0 -uint8x8_t vtbx4_u8_ptr(uint8x8_t a, uint8x8x4_t *b, uint8x8_t c); // VTBX.8 d0, {d0, d1, d2, d3}, d0 -int8x8_t vtbx4_s8_ptr(int8x8_t a, int8x8x4_t *b, int8x8_t c); // VTBX.8 d0, {d0, d1, d2, d3}, d0 -poly8x8_t vtbx4_p8_ptr(poly8x8_t a, poly8x8x4_t *b, uint8x8_t c); // VTBX.8 d0, {d0, d1, d2, d3}, d0 -//Operations with a scalar value -//Vector multiply accumulate with scalar -int16x4_t vmla_lane_s16(int16x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l); // VMLA.I16 d0, d0,d0[0] -int32x2_t vmla_lane_s32(int32x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l); // VMLA.I32 d0, d0,d0[0] -uint16x4_t vmla_lane_u16(uint16x4_t a, uint16x4_t b, uint16x4_t v, __constrange(0,3) int l); // VMLA.I16 d0, d0,d0[0] -uint32x2_t vmla_lane_u32(uint32x2_t a, uint32x2_t b, uint32x2_t v, __constrange(0,1) int l); // VMLA.I32 d0, d0,d0[0] -float32x2_t vmla_lane_f32(float32x2_t a, float32x2_t b, float32x2_t v, __constrange(0,1) int l); // VMLA.F32 d0,d0, d0[0] -int16x8_t vmlaq_lane_s16(int16x8_t a, int16x8_t b, int16x4_t v, __constrange(0,3) 
int l); // VMLA.I16 q0, q0,d0[0] -int32x4_t vmlaq_lane_s32(int32x4_t a, int32x4_t b, int32x2_t v, __constrange(0,1) int l); // VMLA.I32 q0, q0,d0[0] -uint16x8_t vmlaq_lane_u16(uint16x8_t a, uint16x8_t b, uint16x4_t v, __constrange(0,3) int l); // VMLA.I16 q0,q0, d0[0] -uint32x4_t vmlaq_lane_u32(uint32x4_t a, uint32x4_t b, uint32x2_t v, __constrange(0,1) int l); // VMLA.I32 q0,q0, d0[0] -float32x4_t vmlaq_lane_f32(float32x4_t a, float32x4_t b, float32x2_t v, __constrange(0,1) int l); // VMLA.F32 q0,q0, d0[0] -//Vector widening multiply accumulate with scalar -int32x4_t vmlal_lane_s16(int32x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l); //VMLAL.S16 q0, d0,d0[0] -int64x2_t vmlal_lane_s32(int64x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l); //VMLAL.S32 q0, d0,d0[0] -uint32x4_t vmlal_lane_u16(uint32x4_t a, uint16x4_t b, uint16x4_t v, __constrange(0,3) int l); // VMLAL.U16 q0,d0, d0[0] -uint64x2_t vmlal_lane_u32(uint64x2_t a, uint32x2_t b, uint32x2_t v, __constrange(0,1) int l); // VMLAL.U32 q0,d0, d0[0] -//Vector widening saturating doubling multiply accumulate with scalar -int32x4_t vqdmlal_lane_s16(int32x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l); // VQDMLAL.S16 q0,d0, d0[0] -int64x2_t vqdmlal_lane_s32(int64x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l); // VQDMLAL.S32 q0,d0, d0[0] -//Vector multiply subtract with scalar -int16x4_t vmls_lane_s16(int16x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l); // VMLS.I16 d0, d0,d0[0] -int32x2_t vmls_lane_s32(int32x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l); // VMLS.I32 d0, d0,d0[0] -uint16x4_t vmls_lane_u16(uint16x4_t a, uint16x4_t b, uint16x4_t v, __constrange(0,3) int l); // VMLS.I16 d0, d0,d0[0] -uint32x2_t vmls_lane_u32(uint32x2_t a, uint32x2_t b, uint32x2_t v, __constrange(0,1) int l); // VMLS.I32 d0, d0,d0[0] -float32x2_t vmls_lane_f32(float32x2_t a, float32x2_t b, float32x2_t v, __constrange(0,1) int l); // VMLS.F32 d0,d0, d0[0] -int16x8_t 
vmlsq_lane_s16(int16x8_t a, int16x8_t b, int16x4_t v, __constrange(0,3) int l); // VMLS.I16 q0, q0,d0[0] -int32x4_t vmlsq_lane_s32(int32x4_t a, int32x4_t b, int32x2_t v, __constrange(0,1) int l); // VMLS.I32 q0, q0,d0[0] -uint16x8_t vmlsq_lane_u16(uint16x8_t a, uint16x8_t b, uint16x4_t v, __constrange(0,3) int l); // VMLS.I16 q0,q0, d0[0] -uint32x4_t vmlsq_lane_u32(uint32x4_t a, uint32x4_t b, uint32x2_t v, __constrange(0,1) int l); // VMLS.I32 q0,q0, d0[0] -float32x4_t vmlsq_lane_f32(float32x4_t a, float32x4_t b, float32x2_t v, __constrange(0,1) int l); // VMLS.F32 q0,q0, d0[0] -//Vector widening multiply subtract with scalar -int32x4_t vmlsl_lane_s16(int32x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l); // VMLSL.S16 q0, d0,d0[0] -int64x2_t vmlsl_lane_s32(int64x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l); // VMLSL.S32 q0, d0,d0[0] -uint32x4_t vmlsl_lane_u16(uint32x4_t a, uint16x4_t b, uint16x4_t v, __constrange(0,3) int l); // VMLSL.U16 q0,d0, d0[0] -uint64x2_t vmlsl_lane_u32(uint64x2_t a, uint32x2_t b, uint32x2_t v, __constrange(0,1) int l); // VMLSL.U32 q0,d0, d0[0] -//Vector widening saturating doubling multiply subtract with scalar -int32x4_t vqdmlsl_lane_s16(int32x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l); // VQDMLSL.S16 q0,d0, d0[0] -int64x2_t vqdmlsl_lane_s32(int64x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l); // VQDMLSL.S32 q0,d0, d0[0] -//Vector multiply by scalar -int16x4_t vmul_n_s16(int16x4_t a, int16_t b); // VMUL.I16 d0,d0,d0[0] -int32x2_t vmul_n_s32(int32x2_t a, int32_t b); // VMUL.I32 d0,d0,d0[0] -float32x2_t vmul_n_f32(float32x2_t a, float32_t b); // VMUL.F32 d0,d0,d0[0] -uint16x4_t vmul_n_u16(uint16x4_t a, uint16_t b); // VMUL.I16 d0,d0,d0[0] -uint32x2_t vmul_n_u32(uint32x2_t a, uint32_t b); // VMUL.I32 d0,d0,d0[0] -int16x8_t vmulq_n_s16(int16x8_t a, int16_t b); // VMUL.I16 q0,q0,d0[0] -int32x4_t vmulq_n_s32(int32x4_t a, int32_t b); // VMUL.I32 q0,q0,d0[0] -float32x4_t vmulq_n_f32(float32x4_t 
a, float32_t b); // VMUL.F32 q0,q0,d0[0] -uint16x8_t vmulq_n_u16(uint16x8_t a, uint16_t b); // VMUL.I16 q0,q0,d0[0] -uint32x4_t vmulq_n_u32(uint32x4_t a, uint32_t b); // VMUL.I32 q0,q0,d0[0] -//Vector long multiply with scalar -int32x4_t vmull_n_s16(int16x4_t vec1, int16_t val2); // VMULL.S16 q0,d0,d0[0] -int64x2_t vmull_n_s32(int32x2_t vec1, int32_t val2); // VMULL.S32 q0,d0,d0[0] -uint32x4_t vmull_n_u16(uint16x4_t vec1, uint16_t val2); // VMULL.U16 q0,d0,d0[0] -uint64x2_t vmull_n_u32(uint32x2_t vec1, uint32_t val2); // VMULL.U32 q0,d0,d0[0] -//Vector long multiply by scalar -int32x4_t vmull_lane_s16(int16x4_t vec1, int16x4_t val2, __constrange(0, 3) int val3); // VMULL.S16 q0,d0,d0[0] -int64x2_t vmull_lane_s32(int32x2_t vec1, int32x2_t val2, __constrange(0, 1) int val3); // VMULL.S32 q0,d0,d0[0] -uint32x4_t vmull_lane_u16(uint16x4_t vec1, uint16x4_t val2, __constrange(0, 3) int val3); // VMULL.U16 q0,d0,d0[0] -uint64x2_t vmull_lane_u32(uint32x2_t vec1, uint32x2_t val2, __constrange(0, 1) int val3); // VMULL.U32 q0,d0,d0[0] -//Vector saturating doubling long multiply with scalar -int32x4_t vqdmull_n_s16(int16x4_t vec1, int16_t val2); // VQDMULL.S16 q0,d0,d0[0] -int64x2_t vqdmull_n_s32(int32x2_t vec1, int32_t val2); // VQDMULL.S32 q0,d0,d0[0] -//Vector saturating doubling long multiply by scalar -int32x4_t vqdmull_lane_s16(int16x4_t vec1, int16x4_t val2, __constrange(0, 3) int val3); // VQDMULL.S16 q0,d0,d0[0] -int64x2_t vqdmull_lane_s32(int32x2_t vec1, int32x2_t val2, __constrange(0, 1) int val3); // VQDMULL.S32 q0,d0,d0[0] -//Vector saturating doubling multiply high with scalar -int16x4_t vqdmulh_n_s16(int16x4_t vec1, int16_t val2); // VQDMULH.S16 d0,d0,d0[0] -int32x2_t vqdmulh_n_s32(int32x2_t vec1, int32_t val2); // VQDMULH.S32 d0,d0,d0[0] -int16x8_t vqdmulhq_n_s16(int16x8_t vec1, int16_t val2); // VQDMULH.S16 q0,q0,d0[0] -int32x4_t vqdmulhq_n_s32(int32x4_t vec1, int32_t val2); // VQDMULH.S32 q0,q0,d0[0] -//Vector saturating doubling multiply high by scalar 
-int16x4_t vqdmulh_lane_s16(int16x4_t vec1, int16x4_t val2, __constrange(0, 3) int val3); // VQDMULH.S16 d0,d0,d0[0] -int32x2_t vqdmulh_lane_s32(int32x2_t vec1, int32x2_t val2, __constrange(0, 1) int val3); // VQDMULH.S32 d0,d0,d0[0] -int16x8_t vqdmulhq_lane_s16(int16x8_t vec1, int16x4_t val2, __constrange(0, 3) int val3); // VQDMULH.S16 q0,q0,d0[0] -int32x4_t vqdmulhq_lane_s32(int32x4_t vec1, int32x2_t val2, __constrange(0, 1) int val3); // VQDMULH.S32 q0,q0,d0[0] -//Vector saturating rounding doubling multiply high with scalar -int16x4_t vqrdmulh_n_s16(int16x4_t vec1, int16_t val2); // VQRDMULH.S16 d0,d0,d0[0] -int32x2_t vqrdmulh_n_s32(int32x2_t vec1, int32_t val2); // VQRDMULH.S32 d0,d0,d0[0] -int16x8_t vqrdmulhq_n_s16(int16x8_t vec1, int16_t val2); // VQRDMULH.S16 q0,q0,d0[0] -int32x4_t vqrdmulhq_n_s32(int32x4_t vec1, int32_t val2); // VQRDMULH.S32 q0,q0,d0[0] -//Vector rounding saturating doubling multiply high by scalar -int16x4_t vqrdmulh_lane_s16(int16x4_t vec1, int16x4_t val2, __constrange(0, 3) int val3); // VQRDMULH.S16 d0,d0,d0[0] -int32x2_t vqrdmulh_lane_s32(int32x2_t vec1, int32x2_t val2, __constrange(0, 1) int val3); // VQRDMULH.S32 d0,d0,d0[0] -int16x8_t vqrdmulhq_lane_s16(int16x8_t vec1, int16x4_t val2, __constrange(0, 3) int val3); // VQRDMULH.S16 q0,q0,d0[0] -int32x4_t vqrdmulhq_lane_s32(int32x4_t vec1, int32x2_t val2, __constrange(0, 1) int val3); // VQRDMULH.S32 q0,q0,d0[0] -//Vector multiply accumulate with scalar -int16x4_t vmla_n_s16(int16x4_t a, int16x4_t b, int16_t c); // VMLA.I16 d0, d0, d0[0] -int32x2_t vmla_n_s32(int32x2_t a, int32x2_t b, int32_t c); // VMLA.I32 d0, d0, d0[0] -uint16x4_t vmla_n_u16(uint16x4_t a, uint16x4_t b, uint16_t c); // VMLA.I16 d0, d0, d0[0] -uint32x2_t vmla_n_u32(uint32x2_t a, uint32x2_t b, uint32_t c); // VMLA.I32 d0, d0, d0[0] -float32x2_t vmla_n_f32(float32x2_t a, float32x2_t b, float32_t c); // VMLA.F32 d0, d0, d0[0] -int16x8_t vmlaq_n_s16(int16x8_t a, int16x8_t b, int16_t c); // VMLA.I16 q0, q0, d0[0] 
-int32x4_t vmlaq_n_s32(int32x4_t a, int32x4_t b, int32_t c); // VMLA.I32 q0, q0, d0[0] -uint16x8_t vmlaq_n_u16(uint16x8_t a, uint16x8_t b, uint16_t c); // VMLA.I16 q0, q0, d0[0] -uint32x4_t vmlaq_n_u32(uint32x4_t a, uint32x4_t b, uint32_t c); // VMLA.I32 q0, q0, d0[0] -float32x4_t vmlaq_n_f32(float32x4_t a, float32x4_t b, float32_t c); // VMLA.F32 q0, q0, d0[0] -//Vector widening multiply accumulate with scalar -int32x4_t vmlal_n_s16(int32x4_t a, int16x4_t b, int16_t c); // VMLAL.S16 q0, d0, d0[0] -int64x2_t vmlal_n_s32(int64x2_t a, int32x2_t b, int32_t c); // VMLAL.S32 q0, d0, d0[0] -uint32x4_t vmlal_n_u16(uint32x4_t a, uint16x4_t b, uint16_t c); // VMLAL.U16 q0, d0, d0[0] -uint64x2_t vmlal_n_u32(uint64x2_t a, uint32x2_t b, uint32_t c); // VMLAL.U32 q0, d0, d0[0] -//Vector widening saturating doubling multiply accumulate with scalar -int32x4_t vqdmlal_n_s16(int32x4_t a, int16x4_t b, int16_t c); // VQDMLAL.S16 q0, d0, d0[0] -int64x2_t vqdmlal_n_s32(int64x2_t a, int32x2_t b, int32_t c); // VQDMLAL.S32 q0, d0, d0[0] -//Vector multiply subtract with scalar -int16x4_t vmls_n_s16(int16x4_t a, int16x4_t b, int16_t c); // VMLS.I16 d0, d0, d0[0] -int32x2_t vmls_n_s32(int32x2_t a, int32x2_t b, int32_t c); // VMLS.I32 d0, d0, d0[0] -uint16x4_t vmls_n_u16(uint16x4_t a, uint16x4_t b, uint16_t c); // VMLS.I16 d0, d0, d0[0] -uint32x2_t vmls_n_u32(uint32x2_t a, uint32x2_t b, uint32_t c); // VMLS.I32 d0, d0, d0[0] -float32x2_t vmls_n_f32(float32x2_t a, float32x2_t b, float32_t c); // VMLS.F32 d0, d0, d0[0] -int16x8_t vmlsq_n_s16(int16x8_t a, int16x8_t b, int16_t c); // VMLS.I16 q0, q0, d0[0] -int32x4_t vmlsq_n_s32(int32x4_t a, int32x4_t b, int32_t c); // VMLS.I32 q0, q0, d0[0] -uint16x8_t vmlsq_n_u16(uint16x8_t a, uint16x8_t b, uint16_t c); // VMLS.I16 q0, q0, d0[0] -uint32x4_t vmlsq_n_u32(uint32x4_t a, uint32x4_t b, uint32_t c); // VMLS.I32 q0, q0, d0[0] -float32x4_t vmlsq_n_f32(float32x4_t a, float32x4_t b, float32_t c); // VMLS.F32 q0, q0, d0[0] -//Vector widening multiply 
subtract with scalar -int32x4_t vmlsl_n_s16(int32x4_t a, int16x4_t b, int16_t c); // VMLSL.S16 q0, d0, d0[0] -int64x2_t vmlsl_n_s32(int64x2_t a, int32x2_t b, int32_t c); // VMLSL.S32 q0, d0, d0[0] -uint32x4_t vmlsl_n_u16(uint32x4_t a, uint16x4_t b, uint16_t c); // VMLSL.U16 q0, d0, d0[0] -uint64x2_t vmlsl_n_u32(uint64x2_t a, uint32x2_t b, uint32_t c); // VMLSL.U32 q0, d0, d0[0] -//Vector widening saturating doubling multiply subtract with scalar -int32x4_t vqdmlsl_n_s16(int32x4_t a, int16x4_t b, int16_t c); // VQDMLSL.S16 q0, d0, d0[0] -int64x2_t vqdmlsl_n_s32(int64x2_t a, int32x2_t b, int32_t c); // VQDMLSL.S32 q0, d0, d0[0] -//Vector extract -int8x8_t vext_s8(int8x8_t a, int8x8_t b, __constrange(0,7) int c); // VEXT.8 d0,d0,d0,#0 -uint8x8_t vext_u8(uint8x8_t a, uint8x8_t b, __constrange(0,7) int c); // VEXT.8 d0,d0,d0,#0 -poly8x8_t vext_p8(poly8x8_t a, poly8x8_t b, __constrange(0,7) int c); // VEXT.8 d0,d0,d0,#0 -int16x4_t vext_s16(int16x4_t a, int16x4_t b, __constrange(0,3) int c); // VEXT.16 d0,d0,d0,#0 -uint16x4_t vext_u16(uint16x4_t a, uint16x4_t b, __constrange(0,3) int c); // VEXT.16 d0,d0,d0,#0 -poly16x4_t vext_p16(poly16x4_t a, poly16x4_t b, __constrange(0,3) int c); // VEXT.16 d0,d0,d0,#0 -int32x2_t vext_s32(int32x2_t a, int32x2_t b, __constrange(0,1) int c); // VEXT.32 d0,d0,d0,#0 -uint32x2_t vext_u32(uint32x2_t a, uint32x2_t b, __constrange(0,1) int c); // VEXT.32 d0,d0,d0,#0 -int64x1_t vext_s64(int64x1_t a, int64x1_t b, __constrange(0,0) int c); // VEXT.64 d0,d0,d0,#0 -uint64x1_t vext_u64(uint64x1_t a, uint64x1_t b, __constrange(0,0) int c); // VEXT.64 d0,d0,d0,#0 -float32x2_t vext_f32(float32x2_t a, float32x2_t b, __constrange(0,1) int c); // VEXT.32 d0,d0,d0,#0 -int8x16_t vextq_s8(int8x16_t a, int8x16_t b, __constrange(0,15) int c); // VEXT.8 q0,q0,q0,#0 -uint8x16_t vextq_u8(uint8x16_t a, uint8x16_t b, __constrange(0,15) int c); // VEXT.8 q0,q0,q0,#0 -poly8x16_t vextq_p8(poly8x16_t a, poly8x16_t b, __constrange(0,15) int c); // VEXT.8 q0,q0,q0,#0 
-int16x8_t vextq_s16(int16x8_t a, int16x8_t b, __constrange(0,7) int c); // VEXT.16 q0,q0,q0,#0 -uint16x8_t vextq_u16(uint16x8_t a, uint16x8_t b, __constrange(0,7) int c); // VEXT.16 q0,q0,q0,#0 -poly16x8_t vextq_p16(poly16x8_t a, poly16x8_t b, __constrange(0,7) int c); // VEXT.16 q0,q0,q0,#0 -int32x4_t vextq_s32(int32x4_t a, int32x4_t b, __constrange(0,3) int c); // VEXT.32 q0,q0,q0,#0 -uint32x4_t vextq_u32(uint32x4_t a, uint32x4_t b, __constrange(0,3) int c); // VEXT.32 q0,q0,q0,#0 -int64x2_t vextq_s64(int64x2_t a, int64x2_t b, __constrange(0,1) int c); // VEXT.64 q0,q0,q0,#0 -uint64x2_t vextq_u64(uint64x2_t a, uint64x2_t b, __constrange(0,1) int c); // VEXT.64 q0,q0,q0,#0 -float32x4_t vextq_f32(float32x4_t a, float32x4_t b, __constrange(0,3) int c); // VEXT.32 q0,q0,q0,#0 -//Reverse vector elements (swap endianness). VREVn.m reverses the order of the m-bit lanes within a set that is n bits wide. -int8x8_t vrev64_s8(int8x8_t vec); // VREV64.8 d0,d0 -int16x4_t vrev64_s16(int16x4_t vec); // VREV64.16 d0,d0 -int32x2_t vrev64_s32(int32x2_t vec); // VREV64.32 d0,d0 -uint8x8_t vrev64_u8(uint8x8_t vec); // VREV64.8 d0,d0 -uint16x4_t vrev64_u16(uint16x4_t vec); // VREV64.16 d0,d0 -uint32x2_t vrev64_u32(uint32x2_t vec); // VREV64.32 d0,d0 -poly8x8_t vrev64_p8(poly8x8_t vec); // VREV64.8 d0,d0 -poly16x4_t vrev64_p16(poly16x4_t vec); // VREV64.16 d0,d0 -float32x2_t vrev64_f32(float32x2_t vec); // VREV64.32 d0,d0 -int8x16_t vrev64q_s8(int8x16_t vec); // VREV64.8 q0,q0 -int16x8_t vrev64q_s16(int16x8_t vec); // VREV64.16 q0,q0 -int32x4_t vrev64q_s32(int32x4_t vec); // VREV64.32 q0,q0 -uint8x16_t vrev64q_u8(uint8x16_t vec); // VREV64.8 q0,q0 -uint16x8_t vrev64q_u16(uint16x8_t vec); // VREV64.16 q0,q0 -uint32x4_t vrev64q_u32(uint32x4_t vec); // VREV64.32 q0,q0 -poly8x16_t vrev64q_p8(poly8x16_t vec); // VREV64.8 q0,q0 -poly16x8_t vrev64q_p16(poly16x8_t vec); // VREV64.16 q0,q0 -float32x4_t vrev64q_f32(float32x4_t vec); // VREV64.32 q0,q0 -int8x8_t vrev32_s8(int8x8_t vec); //
VREV32.8 d0,d0 -int16x4_t vrev32_s16(int16x4_t vec); // VREV32.16 d0,d0 -uint8x8_t vrev32_u8(uint8x8_t vec); // VREV32.8 d0,d0 -uint16x4_t vrev32_u16(uint16x4_t vec); // VREV32.16 d0,d0 -poly8x8_t vrev32_p8(poly8x8_t vec); // VREV32.8 d0,d0 -poly16x4_t vrev32_p16(poly16x4_t vec); // VREV32.16 d0,d0 -int8x16_t vrev32q_s8(int8x16_t vec); // VREV32.8 q0,q0 -int16x8_t vrev32q_s16(int16x8_t vec); // VREV32.16 q0,q0 -uint8x16_t vrev32q_u8(uint8x16_t vec); // VREV32.8 q0,q0 -uint16x8_t vrev32q_u16(uint16x8_t vec); // VREV32.16 q0,q0 -poly8x16_t vrev32q_p8(poly8x16_t vec); // VREV32.8 q0,q0 -poly16x8_t vrev32q_p16(poly16x8_t vec); // VREV32.16 q0,q0 -int8x8_t vrev16_s8(int8x8_t vec); // VREV16.8 d0,d0 -uint8x8_t vrev16_u8(uint8x8_t vec); // VREV16.8 d0,d0 -poly8x8_t vrev16_p8(poly8x8_t vec); // VREV16.8 d0,d0 -int8x16_t vrev16q_s8(int8x16_t vec); // VREV16.8 q0,q0 -uint8x16_t vrev16q_u8(uint8x16_t vec); // VREV16.8 q0,q0 -poly8x16_t vrev16q_p8(poly8x16_t vec); // VREV16.8 q0,q0 -//Other single operand arithmetic -//Absolute: Vd[i] = |Va[i]| -int8x8_t vabs_s8(int8x8_t a); // VABS.S8 d0,d0 -int16x4_t vabs_s16(int16x4_t a); // VABS.S16 d0,d0 -int32x2_t vabs_s32(int32x2_t a); // VABS.S32 d0,d0 -float32x2_t vabs_f32(float32x2_t a); // VABS.F32 d0,d0 -int8x16_t vabsq_s8(int8x16_t a); // VABS.S8 q0,q0 -int16x8_t vabsq_s16(int16x8_t a); // VABS.S16 q0,q0 -int32x4_t vabsq_s32(int32x4_t a); // VABS.S32 q0,q0 -float32x4_t vabsq_f32(float32x4_t a); // VABS.F32 q0,q0 -//Saturating absolute: Vd[i] = sat(|Va[i]|) -int8x8_t vqabs_s8(int8x8_t a); // VQABS.S8 d0,d0 -int16x4_t vqabs_s16(int16x4_t a); // VQABS.S16 d0,d0 -int32x2_t vqabs_s32(int32x2_t a); // VQABS.S32 d0,d0 -int8x16_t vqabsq_s8(int8x16_t a); // VQABS.S8 q0,q0 -int16x8_t vqabsq_s16(int16x8_t a); // VQABS.S16 q0,q0 -int32x4_t vqabsq_s32(int32x4_t a); // VQABS.S32 q0,q0 -//Negate: Vd[i] = - Va[i] -int8x8_t vneg_s8(int8x8_t a); // VNEG.S8 d0,d0 -int16x4_t vneg_s16(int16x4_t a); // VNEG.S16 d0,d0 -int32x2_t vneg_s32(int32x2_t a); //
VNEG.S32 d0,d0 -float32x2_t vneg_f32(float32x2_t a); // VNEG.F32 d0,d0 -int8x16_t vnegq_s8(int8x16_t a); // VNEG.S8 q0,q0 -int16x8_t vnegq_s16(int16x8_t a); // VNEG.S16 q0,q0 -int32x4_t vnegq_s32(int32x4_t a); // VNEG.S32 q0,q0 -float32x4_t vnegq_f32(float32x4_t a); // VNEG.F32 q0,q0 -//Saturating Negate: sat(Vd[i] = - Va[i]) -int8x8_t vqneg_s8(int8x8_t a); // VQNEG.S8 d0,d0 -int16x4_t vqneg_s16(int16x4_t a); // VQNEG.S16 d0,d0 -int32x2_t vqneg_s32(int32x2_t a); // VQNEG.S32 d0,d0 -int8x16_t vqnegq_s8(int8x16_t a); // VQNEG.S8 q0,q0 -int16x8_t vqnegq_s16(int16x8_t a); // VQNEG.S16 q0,q0 -int32x4_t vqnegq_s32(int32x4_t a); // VQNEG.S32 q0,q0 -//Count leading sign bits -int8x8_t vcls_s8(int8x8_t a); // VCLS.S8 d0,d0 -int16x4_t vcls_s16(int16x4_t a); // VCLS.S16 d0,d0 -int32x2_t vcls_s32(int32x2_t a); // VCLS.S32 d0,d0 -int8x16_t vclsq_s8(int8x16_t a); // VCLS.S8 q0,q0 -int16x8_t vclsq_s16(int16x8_t a); // VCLS.S16 q0,q0 -int32x4_t vclsq_s32(int32x4_t a); // VCLS.S32 q0,q0 -//Count leading zeros -int8x8_t vclz_s8(int8x8_t a); // VCLZ.I8 d0,d0 -int16x4_t vclz_s16(int16x4_t a); // VCLZ.I16 d0,d0 -int32x2_t vclz_s32(int32x2_t a); // VCLZ.I32 d0,d0 -uint8x8_t vclz_u8(uint8x8_t a); // VCLZ.I8 d0,d0 -uint16x4_t vclz_u16(uint16x4_t a); // VCLZ.I16 d0,d0 -uint32x2_t vclz_u32(uint32x2_t a); // VCLZ.I32 d0,d0 -int8x16_t vclzq_s8(int8x16_t a); // VCLZ.I8 q0,q0 -int16x8_t vclzq_s16(int16x8_t a); // VCLZ.I16 q0,q0 -int32x4_t vclzq_s32(int32x4_t a); // VCLZ.I32 q0,q0 -uint8x16_t vclzq_u8(uint8x16_t a); // VCLZ.I8 q0,q0 -uint16x8_t vclzq_u16(uint16x8_t a); // VCLZ.I16 q0,q0 -uint32x4_t vclzq_u32(uint32x4_t a); // VCLZ.I32 q0,q0 -//Count number of set bits -uint8x8_t vcnt_u8(uint8x8_t a); // VCNT.8 d0,d0 -int8x8_t vcnt_s8(int8x8_t a); // VCNT.8 d0,d0 -poly8x8_t vcnt_p8(poly8x8_t a); // VCNT.8 d0,d0 -uint8x16_t vcntq_u8(uint8x16_t a); // VCNT.8 q0,q0 -int8x16_t vcntq_s8(int8x16_t a); // VCNT.8 q0,q0 -poly8x16_t vcntq_p8(poly8x16_t a); // VCNT.8 q0,q0 -//Reciprocal estimate -float32x2_t vrecpe_f32(float32x2_t a); // VRECPE.F32
d0,d0 -uint32x2_t vrecpe_u32(uint32x2_t a); // VRECPE.U32 d0,d0 -float32x4_t vrecpeq_f32(float32x4_t a); // VRECPE.F32 q0,q0 -uint32x4_t vrecpeq_u32(uint32x4_t a); // VRECPE.U32 q0,q0 -//Reciprocal square root estimate -float32x2_t vrsqrte_f32(float32x2_t a); // VRSQRTE.F32 d0,d0 -uint32x2_t vrsqrte_u32(uint32x2_t a); // VRSQRTE.U32 d0,d0 -float32x4_t vrsqrteq_f32(float32x4_t a); // VRSQRTE.F32 q0,q0 -uint32x4_t vrsqrteq_u32(uint32x4_t a); // VRSQRTE.U32 q0,q0 -//Logical operations -//Bitwise not -int8x8_t vmvn_s8(int8x8_t a); // VMVN d0,d0 -int16x4_t vmvn_s16(int16x4_t a); // VMVN d0,d0 -int32x2_t vmvn_s32(int32x2_t a); // VMVN d0,d0 -uint8x8_t vmvn_u8(uint8x8_t a); // VMVN d0,d0 -uint16x4_t vmvn_u16(uint16x4_t a); // VMVN d0,d0 -uint32x2_t vmvn_u32(uint32x2_t a); // VMVN d0,d0 -poly8x8_t vmvn_p8(poly8x8_t a); // VMVN d0,d0 -int8x16_t vmvnq_s8(int8x16_t a); // VMVN q0,q0 -int16x8_t vmvnq_s16(int16x8_t a); // VMVN q0,q0 -int32x4_t vmvnq_s32(int32x4_t a); // VMVN q0,q0 -uint8x16_t vmvnq_u8(uint8x16_t a); // VMVN q0,q0 -uint16x8_t vmvnq_u16(uint16x8_t a); // VMVN q0,q0 -uint32x4_t vmvnq_u32(uint32x4_t a); // VMVN q0,q0 -poly8x16_t vmvnq_p8(poly8x16_t a); // VMVN q0,q0 -//Bitwise and -int8x8_t vand_s8(int8x8_t a, int8x8_t b); // VAND d0,d0,d0 -int16x4_t vand_s16(int16x4_t a, int16x4_t b); // VAND d0,d0,d0 -int32x2_t vand_s32(int32x2_t a, int32x2_t b); // VAND d0,d0,d0 -int64x1_t vand_s64(int64x1_t a, int64x1_t b); // VAND d0,d0,d0 -uint8x8_t vand_u8(uint8x8_t a, uint8x8_t b); // VAND d0,d0,d0 -uint16x4_t vand_u16(uint16x4_t a, uint16x4_t b); // VAND d0,d0,d0 -uint32x2_t vand_u32(uint32x2_t a, uint32x2_t b); // VAND d0,d0,d0 -uint64x1_t vand_u64(uint64x1_t a, uint64x1_t b); // VAND d0,d0,d0 -int8x16_t vandq_s8(int8x16_t a, int8x16_t b); // VAND q0,q0,q0 -int16x8_t vandq_s16(int16x8_t a, int16x8_t b); // VAND q0,q0,q0 -int32x4_t vandq_s32(int32x4_t a, int32x4_t b); // VAND q0,q0,q0 -int64x2_t vandq_s64(int64x2_t a, int64x2_t b); // VAND q0,q0,q0 -uint8x16_t 
vandq_u8(uint8x16_t a, uint8x16_t b); // VAND q0,q0,q0 -uint16x8_t vandq_u16(uint16x8_t a, uint16x8_t b); // VAND q0,q0,q0 -uint32x4_t vandq_u32(uint32x4_t a, uint32x4_t b); // VAND q0,q0,q0 -uint64x2_t vandq_u64(uint64x2_t a, uint64x2_t b); // VAND q0,q0,q0 -//Bitwise or -int8x8_t vorr_s8(int8x8_t a, int8x8_t b); // VORR d0,d0,d0 -int16x4_t vorr_s16(int16x4_t a, int16x4_t b); // VORR d0,d0,d0 -int32x2_t vorr_s32(int32x2_t a, int32x2_t b); // VORR d0,d0,d0 -int64x1_t vorr_s64(int64x1_t a, int64x1_t b); // VORR d0,d0,d0 -uint8x8_t vorr_u8(uint8x8_t a, uint8x8_t b); // VORR d0,d0,d0 -uint16x4_t vorr_u16(uint16x4_t a, uint16x4_t b); // VORR d0,d0,d0 -uint32x2_t vorr_u32(uint32x2_t a, uint32x2_t b); // VORR d0,d0,d0 -uint64x1_t vorr_u64(uint64x1_t a, uint64x1_t b); // VORR d0,d0,d0 -int8x16_t vorrq_s8(int8x16_t a, int8x16_t b); // VORR q0,q0,q0 -int16x8_t vorrq_s16(int16x8_t a, int16x8_t b); // VORR q0,q0,q0 -int32x4_t vorrq_s32(int32x4_t a, int32x4_t b); // VORR q0,q0,q0 -int64x2_t vorrq_s64(int64x2_t a, int64x2_t b); // VORR q0,q0,q0 -uint8x16_t vorrq_u8(uint8x16_t a, uint8x16_t b); // VORR q0,q0,q0 -uint16x8_t vorrq_u16(uint16x8_t a, uint16x8_t b); // VORR q0,q0,q0 -uint32x4_t vorrq_u32(uint32x4_t a, uint32x4_t b); // VORR q0,q0,q0 -uint64x2_t vorrq_u64(uint64x2_t a, uint64x2_t b); // VORR q0,q0,q0 -//Bitwise exclusive or (EOR or XOR) -int8x8_t veor_s8(int8x8_t a, int8x8_t b); // VEOR d0,d0,d0 -int16x4_t veor_s16(int16x4_t a, int16x4_t b); // VEOR d0,d0,d0 -int32x2_t veor_s32(int32x2_t a, int32x2_t b); // VEOR d0,d0,d0 -int64x1_t veor_s64(int64x1_t a, int64x1_t b); // VEOR d0,d0,d0 -uint8x8_t veor_u8(uint8x8_t a, uint8x8_t b); // VEOR d0,d0,d0 -uint16x4_t veor_u16(uint16x4_t a, uint16x4_t b); // VEOR d0,d0,d0 -uint32x2_t veor_u32(uint32x2_t a, uint32x2_t b); // VEOR d0,d0,d0 -uint64x1_t veor_u64(uint64x1_t a, uint64x1_t b); // VEOR d0,d0,d0 -int8x16_t veorq_s8(int8x16_t a, int8x16_t b); // VEOR q0,q0,q0 -int16x8_t veorq_s16(int16x8_t a, int16x8_t b); // VEOR 
q0,q0,q0 -int32x4_t veorq_s32(int32x4_t a, int32x4_t b); // VEOR q0,q0,q0 -int64x2_t veorq_s64(int64x2_t a, int64x2_t b); // VEOR q0,q0,q0 -uint8x16_t veorq_u8(uint8x16_t a, uint8x16_t b); // VEOR q0,q0,q0 -uint16x8_t veorq_u16(uint16x8_t a, uint16x8_t b); // VEOR q0,q0,q0 -uint32x4_t veorq_u32(uint32x4_t a, uint32x4_t b); // VEOR q0,q0,q0 -uint64x2_t veorq_u64(uint64x2_t a, uint64x2_t b); // VEOR q0,q0,q0 -//Bit Clear -int8x8_t vbic_s8(int8x8_t a, int8x8_t b); // VBIC d0,d0,d0 -int16x4_t vbic_s16(int16x4_t a, int16x4_t b); // VBIC d0,d0,d0 -int32x2_t vbic_s32(int32x2_t a, int32x2_t b); // VBIC d0,d0,d0 -int64x1_t vbic_s64(int64x1_t a, int64x1_t b); // VBIC d0,d0,d0 -uint8x8_t vbic_u8(uint8x8_t a, uint8x8_t b); // VBIC d0,d0,d0 -uint16x4_t vbic_u16(uint16x4_t a, uint16x4_t b); // VBIC d0,d0,d0 -uint32x2_t vbic_u32(uint32x2_t a, uint32x2_t b); // VBIC d0,d0,d0 -uint64x1_t vbic_u64(uint64x1_t a, uint64x1_t b); // VBIC d0,d0,d0 -int8x16_t vbicq_s8(int8x16_t a, int8x16_t b); // VBIC q0,q0,q0 -int16x8_t vbicq_s16(int16x8_t a, int16x8_t b); // VBIC q0,q0,q0 -int32x4_t vbicq_s32(int32x4_t a, int32x4_t b); // VBIC q0,q0,q0 -int64x2_t vbicq_s64(int64x2_t a, int64x2_t b); // VBIC q0,q0,q0 -uint8x16_t vbicq_u8(uint8x16_t a, uint8x16_t b); // VBIC q0,q0,q0 -uint16x8_t vbicq_u16(uint16x8_t a, uint16x8_t b); // VBIC q0,q0,q0 -uint32x4_t vbicq_u32(uint32x4_t a, uint32x4_t b); // VBIC q0,q0,q0 -uint64x2_t vbicq_u64(uint64x2_t a, uint64x2_t b); // VBIC q0,q0,q0 -//Bitwise OR complement -int8x8_t vorn_s8(int8x8_t a, int8x8_t b); // VORN d0,d0,d0 -int16x4_t vorn_s16(int16x4_t a, int16x4_t b); // VORN d0,d0,d0 -int32x2_t vorn_s32(int32x2_t a, int32x2_t b); // VORN d0,d0,d0 -int64x1_t vorn_s64(int64x1_t a, int64x1_t b); // VORN d0,d0,d0 -uint8x8_t vorn_u8(uint8x8_t a, uint8x8_t b); // VORN d0,d0,d0 -uint16x4_t vorn_u16(uint16x4_t a, uint16x4_t b); // VORN d0,d0,d0 -uint32x2_t vorn_u32(uint32x2_t a, uint32x2_t b); // VORN d0,d0,d0 -uint64x1_t vorn_u64(uint64x1_t a, uint64x1_t b); // 
VORN d0,d0,d0 -int8x16_t vornq_s8(int8x16_t a, int8x16_t b); // VORN q0,q0,q0 -int16x8_t vornq_s16(int16x8_t a, int16x8_t b); // VORN q0,q0,q0 -int32x4_t vornq_s32(int32x4_t a, int32x4_t b); // VORN q0,q0,q0 -int64x2_t vornq_s64(int64x2_t a, int64x2_t b); // VORN q0,q0,q0 -uint8x16_t vornq_u8(uint8x16_t a, uint8x16_t b); // VORN q0,q0,q0 -uint16x8_t vornq_u16(uint16x8_t a, uint16x8_t b); // VORN q0,q0,q0 -uint32x4_t vornq_u32(uint32x4_t a, uint32x4_t b); // VORN q0,q0,q0 -uint64x2_t vornq_u64(uint64x2_t a, uint64x2_t b); // VORN q0,q0,q0 -//Bitwise Select -int8x8_t vbsl_s8(uint8x8_t a, int8x8_t b, int8x8_t c); // VBSL d0,d0,d0 -int16x4_t vbsl_s16(uint16x4_t a, int16x4_t b, int16x4_t c); // VBSL d0,d0,d0 -int32x2_t vbsl_s32(uint32x2_t a, int32x2_t b, int32x2_t c); // VBSL d0,d0,d0 -int64x1_t vbsl_s64(uint64x1_t a, int64x1_t b, int64x1_t c); // VBSL d0,d0,d0 -uint8x8_t vbsl_u8(uint8x8_t a, uint8x8_t b, uint8x8_t c); // VBSL d0,d0,d0 -uint16x4_t vbsl_u16(uint16x4_t a, uint16x4_t b, uint16x4_t c); // VBSL d0,d0,d0 -uint32x2_t vbsl_u32(uint32x2_t a, uint32x2_t b, uint32x2_t c); // VBSL d0,d0,d0 -uint64x1_t vbsl_u64(uint64x1_t a, uint64x1_t b, uint64x1_t c); // VBSL d0,d0,d0 -float32x2_t vbsl_f32(uint32x2_t a, float32x2_t b, float32x2_t c); // VBSL d0,d0,d0 -poly8x8_t vbsl_p8(uint8x8_t a, poly8x8_t b, poly8x8_t c); // VBSL d0,d0,d0 -poly16x4_t vbsl_p16(uint16x4_t a, poly16x4_t b, poly16x4_t c); // VBSL d0,d0,d0 -int8x16_t vbslq_s8(uint8x16_t a, int8x16_t b, int8x16_t c); // VBSL q0,q0,q0 -int16x8_t vbslq_s16(uint16x8_t a, int16x8_t b, int16x8_t c); // VBSL q0,q0,q0 -int32x4_t vbslq_s32(uint32x4_t a, int32x4_t b, int32x4_t c); // VBSL q0,q0,q0 -int64x2_t vbslq_s64(uint64x2_t a, int64x2_t b, int64x2_t c); // VBSL q0,q0,q0 -uint8x16_t vbslq_u8(uint8x16_t a, uint8x16_t b, uint8x16_t c); // VBSL q0,q0,q0 -uint16x8_t vbslq_u16(uint16x8_t a, uint16x8_t b, uint16x8_t c); // VBSL q0,q0,q0 -uint32x4_t vbslq_u32(uint32x4_t a, uint32x4_t b, uint32x4_t c); // VBSL q0,q0,q0 
-uint64x2_t vbslq_u64(uint64x2_t a, uint64x2_t b, uint64x2_t c); // VBSL q0,q0,q0 -float32x4_t vbslq_f32(uint32x4_t a, float32x4_t b, float32x4_t c); // VBSL q0,q0,q0 -poly8x16_t vbslq_p8(uint8x16_t a, poly8x16_t b, poly8x16_t c); // VBSL q0,q0,q0 -poly16x8_t vbslq_p16(uint16x8_t a, poly16x8_t b, poly16x8_t c); // VBSL q0,q0,q0 -//Transposition operations -//Transpose elements -int8x8x2_t vtrn_s8(int8x8_t a, int8x8_t b); // VTRN.8 d0,d0 -int16x4x2_t vtrn_s16(int16x4_t a, int16x4_t b); // VTRN.16 d0,d0 -int32x2x2_t vtrn_s32(int32x2_t a, int32x2_t b); // VTRN.32 d0,d0 -uint8x8x2_t vtrn_u8(uint8x8_t a, uint8x8_t b); // VTRN.8 d0,d0 -uint16x4x2_t vtrn_u16(uint16x4_t a, uint16x4_t b); // VTRN.16 d0,d0 -uint32x2x2_t vtrn_u32(uint32x2_t a, uint32x2_t b); // VTRN.32 d0,d0 -float32x2x2_t vtrn_f32(float32x2_t a, float32x2_t b); // VTRN.32 d0,d0 -poly8x8x2_t vtrn_p8(poly8x8_t a, poly8x8_t b); // VTRN.8 d0,d0 -poly16x4x2_t vtrn_p16(poly16x4_t a, poly16x4_t b); // VTRN.16 d0,d0 -int8x16x2_t vtrnq_s8(int8x16_t a, int8x16_t b); // VTRN.8 q0,q0 -int16x8x2_t vtrnq_s16(int16x8_t a, int16x8_t b); // VTRN.16 q0,q0 -int32x4x2_t vtrnq_s32(int32x4_t a, int32x4_t b); // VTRN.32 q0,q0 -uint8x16x2_t vtrnq_u8(uint8x16_t a, uint8x16_t b); // VTRN.8 q0,q0 -uint16x8x2_t vtrnq_u16(uint16x8_t a, uint16x8_t b); // VTRN.16 q0,q0 -uint32x4x2_t vtrnq_u32(uint32x4_t a, uint32x4_t b); // VTRN.32 q0,q0 -float32x4x2_t vtrnq_f32(float32x4_t a, float32x4_t b); // VTRN.32 q0,q0 -poly8x16x2_t vtrnq_p8(poly8x16_t a, poly8x16_t b); // VTRN.8 q0,q0 -poly16x8x2_t vtrnq_p16(poly16x8_t a, poly16x8_t b); // VTRN.16 q0,q0 -//Interleave elements -int8x8x2_t vzip_s8(int8x8_t a, int8x8_t b); // VZIP.8 d0,d0 -int16x4x2_t vzip_s16(int16x4_t a, int16x4_t b); // VZIP.16 d0,d0 -int32x2x2_t vzip_s32(int32x2_t a, int32x2_t b); // VZIP.32 d0,d0 -uint8x8x2_t vzip_u8(uint8x8_t a, uint8x8_t b); // VZIP.8 d0,d0 -uint16x4x2_t vzip_u16(uint16x4_t a, uint16x4_t b); // VZIP.16 d0,d0 -uint32x2x2_t vzip_u32(uint32x2_t a, uint32x2_t b); 
// VZIP.32 d0,d0 -float32x2x2_t vzip_f32(float32x2_t a, float32x2_t b); // VZIP.32 d0,d0 -poly8x8x2_t vzip_p8(poly8x8_t a, poly8x8_t b); // VZIP.8 d0,d0 -poly16x4x2_t vzip_p16(poly16x4_t a, poly16x4_t b); // VZIP.16 d0,d0 -int8x16x2_t vzipq_s8(int8x16_t a, int8x16_t b); // VZIP.8 q0,q0 -int16x8x2_t vzipq_s16(int16x8_t a, int16x8_t b); // VZIP.16 q0,q0 -int32x4x2_t vzipq_s32(int32x4_t a, int32x4_t b); // VZIP.32 q0,q0 -uint8x16x2_t vzipq_u8(uint8x16_t a, uint8x16_t b); // VZIP.8 q0,q0 -uint16x8x2_t vzipq_u16(uint16x8_t a, uint16x8_t b); // VZIP.16 q0,q0 -uint32x4x2_t vzipq_u32(uint32x4_t a, uint32x4_t b); // VZIP.32 q0,q0 -float32x4x2_t vzipq_f32(float32x4_t a, float32x4_t b); // VZIP.32 q0,q0 -poly8x16x2_t vzipq_p8(poly8x16_t a, poly8x16_t b); // VZIP.8 q0,q0 -poly16x8x2_t vzipq_p16(poly16x8_t a, poly16x8_t b); // VZIP.16 q0,q0 -//De-Interleave elements -int8x8x2_t vuzp_s8(int8x8_t a, int8x8_t b); // VUZP.8 d0,d0 -int16x4x2_t vuzp_s16(int16x4_t a, int16x4_t b); // VUZP.16 d0,d0 -int32x2x2_t vuzp_s32(int32x2_t a, int32x2_t b); // VUZP.32 d0,d0 -uint8x8x2_t vuzp_u8(uint8x8_t a, uint8x8_t b); // VUZP.8 d0,d0 -uint16x4x2_t vuzp_u16(uint16x4_t a, uint16x4_t b); // VUZP.16 d0,d0 -uint32x2x2_t vuzp_u32(uint32x2_t a, uint32x2_t b); // VUZP.32 d0,d0 -float32x2x2_t vuzp_f32(float32x2_t a, float32x2_t b); // VUZP.32 d0,d0 -poly8x8x2_t vuzp_p8(poly8x8_t a, poly8x8_t b); // VUZP.8 d0,d0 -poly16x4x2_t vuzp_p16(poly16x4_t a, poly16x4_t b); // VUZP.16 d0,d0 -int8x16x2_t vuzpq_s8(int8x16_t a, int8x16_t b); // VUZP.8 q0,q0 -int16x8x2_t vuzpq_s16(int16x8_t a, int16x8_t b); // VUZP.16 q0,q0 -int32x4x2_t vuzpq_s32(int32x4_t a, int32x4_t b); // VUZP.32 q0,q0 -uint8x16x2_t vuzpq_u8(uint8x16_t a, uint8x16_t b); // VUZP.8 q0,q0 -uint16x8x2_t vuzpq_u16(uint16x8_t a, uint16x8_t b); // VUZP.16 q0,q0 -uint32x4x2_t vuzpq_u32(uint32x4_t a, uint32x4_t b); // VUZP.32 q0,q0 -float32x4x2_t vuzpq_f32(float32x4_t a, float32x4_t b); // VUZP.32 q0,q0 -poly8x16x2_t vuzpq_p8(poly8x16_t a, poly8x16_t b); 
// VUZP.8 q0,q0 -poly16x8x2_t vuzpq_p16(poly16x8_t a, poly16x8_t b); // VUZP.16 q0,q0 - - -//^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -// The following macros work around the "immediate parameter" requirement of some x86 intrinsics. A release build can manage without them, -//but a debug build needs them for the code to compile at all, unless the "Intrinsic parameter must be an immediate value" error is the goal. -// -#if ( ((defined _MSC_VER) && (_MSC_VER > 1600)) || defined (__INTEL_COMPILER) )&& defined NDEBUG //use the native intrinsics only for release builds on VS2012+ or the Intel compiler; VS2010 and earlier need the workaround even in release builds. - - #define _MM_ALIGNR_EPI8 _mm_alignr_epi8 - - #define _MM_EXTRACT_EPI16 _mm_extract_epi16 - #define _MM_INSERT_EPI16 _mm_insert_epi16 -#ifdef USE_SSE4 - #define _MM_EXTRACT_EPI8 _mm_extract_epi8 - #define _MM_EXTRACT_EPI32 _mm_extract_epi32 - #define _MM_EXTRACT_PS _mm_extract_ps - - #define _MM_INSERT_EPI8 _mm_insert_epi8 - #define _MM_INSERT_EPI32 _mm_insert_epi32 - #define _MM_INSERT_PS _mm_insert_ps -#ifdef _NEON2SSE_64BIT - #define _MM_INSERT_EPI64 _mm_insert_epi64 - #define _MM_EXTRACT_EPI64 _mm_extract_epi64 -#endif -#endif //SSE4 -#else - #define _NEON2SSE_COMMA , - #define _NEON2SSE_SWITCH16(NAME, a, b, LANE) \ - switch(LANE) \ - { \ - case 0: return NAME(a b, 0); \ - case 1: return NAME(a b, 1); \ - case 2: return NAME(a b, 2); \ - case 3: return NAME(a b, 3); \ - case 4: return NAME(a b, 4); \ - case 5: return NAME(a b, 5); \ - case 6: return NAME(a b, 6); \ - case 7: return NAME(a b, 7); \ - case 8: return NAME(a b, 8); \ - case 9: return NAME(a b, 9); \ - case 10: return NAME(a b, 10); \ - case 11: return NAME(a b, 11); \ - case 12: return NAME(a b, 12); \ - case 13: return NAME(a b, 13); \ - case 14: return NAME(a b, 14); \ - case 15: return NAME(a b, 15); \ - default: return NAME(a b, 0); \ - } - - #define _NEON2SSE_SWITCH8(NAME, vec, LANE, p) \ - switch(LANE) \ - { \
- case 0: return NAME(vec p,0); \ - case 1: return NAME(vec p,1); \ - case 2: return NAME(vec p,2); \ - case 3: return NAME(vec p,3); \ - case 4: return NAME(vec p,4); \ - case 5: return NAME(vec p,5); \ - case 6: return NAME(vec p,6); \ - case 7: return NAME(vec p,7); \ - default: return NAME(vec p,0); \ - } - - #define _NEON2SSE_SWITCH4(NAME, case0, case1, case2, case3, vec, LANE, p) \ - switch(LANE) \ - { \ - case case0: return NAME(vec p,case0); \ - case case1: return NAME(vec p,case1); \ - case case2: return NAME(vec p,case2); \ - case case3: return NAME(vec p,case3); \ - default: return NAME(vec p,case0); \ - } - - _NEON2SSE_INLINE __m128i _MM_ALIGNR_EPI8(__m128i a, __m128i b, int LANE) - { - _NEON2SSE_SWITCH16(_mm_alignr_epi8, a, _NEON2SSE_COMMA b, LANE) - } - - _NEON2SSE_INLINE __m128i _MM_INSERT_EPI16(__m128i vec, int p, const int LANE) - { - _NEON2SSE_SWITCH8(_mm_insert_epi16, vec, LANE, _NEON2SSE_COMMA p) - } - - _NEON2SSE_INLINE int _MM_EXTRACT_EPI16(__m128i vec, const int LANE) - { - _NEON2SSE_SWITCH8(_mm_extract_epi16, vec, LANE,) - } - -#ifdef USE_SSE4 - _NEON2SSE_INLINE int _MM_EXTRACT_EPI32(__m128i vec, const int LANE) - { - _NEON2SSE_SWITCH4(_mm_extract_epi32, 0,1,2,3, vec, LANE,) - } - - _NEON2SSE_INLINE int _MM_EXTRACT_PS(__m128 vec, const int LANE) - { - _NEON2SSE_SWITCH4(_mm_extract_ps, 0,1,2,3, vec, LANE,) - } - - _NEON2SSE_INLINE int _MM_EXTRACT_EPI8(__m128i vec, const int LANE) - { - _NEON2SSE_SWITCH16(_mm_extract_epi8, vec, , LANE) - } - - _NEON2SSE_INLINE __m128i _MM_INSERT_EPI32(__m128i vec, int p, const int LANE) - { - _NEON2SSE_SWITCH4(_mm_insert_epi32, 0, 1, 2, 3, vec, LANE, _NEON2SSE_COMMA p) - } - - _NEON2SSE_INLINE __m128i _MM_INSERT_EPI8(__m128i vec, int p, const int LANE) - { - _NEON2SSE_SWITCH16(_mm_insert_epi8, vec, _NEON2SSE_COMMA p, LANE) - } - -#ifdef _NEON2SSE_64BIT - //the special case of functions available only for SSE4 and 64-bit build. 
- _NEON2SSE_INLINE __m128i _MM_INSERT_EPI64(__m128i vec, int p, const int LANE) - { - switch(LANE) { - case 0: - return _mm_insert_epi64(vec, p, 0); - case 1: - return _mm_insert_epi64(vec, p, 1); - default: - return _mm_insert_epi64(vec, p, 0); - } - } - - _NEON2SSE_INLINE int64_t _MM_EXTRACT_EPI64(__m128i val, const int LANE) - { - if (LANE ==0) return _mm_extract_epi64(val, 0); - else return _mm_extract_epi64(val, 1); - } -#endif - - _NEON2SSE_INLINE __m128 _MM_INSERT_PS(__m128 vec, __m128 p, const int LANE) - { - _NEON2SSE_SWITCH4(_mm_insert_ps, 0, 16, 32, 48, vec, LANE, _NEON2SSE_COMMA p) - } - -#endif //USE_SSE4 - -#endif //#ifdef NDEBUG - -//~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -// Below are some helper functions used either for SSE4 intrinsics "emulation" for SSSE3 limited devices -// or for some specific commonly used operations implementation missing in SSE -#ifdef USE_SSE4 - #define _MM_CVTEPU8_EPI16 _mm_cvtepu8_epi16 - #define _MM_CVTEPU16_EPI32 _mm_cvtepu16_epi32 - #define _MM_CVTEPU32_EPI64 _mm_cvtepu32_epi64 - - #define _MM_CVTEPI8_EPI16 _mm_cvtepi8_epi16 - #define _MM_CVTEPI16_EPI32 _mm_cvtepi16_epi32 - #define _MM_CVTEPI32_EPI64 _mm_cvtepi32_epi64 - - #define _MM_MAX_EPI8 _mm_max_epi8 - #define _MM_MAX_EPI32 _mm_max_epi32 - #define _MM_MAX_EPU16 _mm_max_epu16 - #define _MM_MAX_EPU32 _mm_max_epu32 - - #define _MM_MIN_EPI8 _mm_min_epi8 - #define _MM_MIN_EPI32 _mm_min_epi32 - #define _MM_MIN_EPU16 _mm_min_epu16 - #define _MM_MIN_EPU32 _mm_min_epu32 - - #define _MM_BLENDV_EPI8 _mm_blendv_epi8 - #define _MM_PACKUS_EPI32 _mm_packus_epi32 - #define _MM_PACKUS1_EPI32(a) _mm_packus_epi32(a, a) - - #define _MM_MULLO_EPI32 _mm_mullo_epi32 - #define _MM_MUL_EPI32 _mm_mul_epi32 - - #define _MM_CMPEQ_EPI64 _mm_cmpeq_epi64 -#else //no SSE4 !!!!!! 
- _NEON2SSE_INLINE __m128i _MM_CVTEPU8_EPI16(__m128i a) - { - __m128i zero = _mm_setzero_si128(); - return _mm_unpacklo_epi8(a, zero); - } - - _NEON2SSE_INLINE __m128i _MM_CVTEPU16_EPI32(__m128i a) - { - __m128i zero = _mm_setzero_si128(); - return _mm_unpacklo_epi16(a, zero); - } - - _NEON2SSE_INLINE __m128i _MM_CVTEPU32_EPI64(__m128i a) - { - __m128i zero = _mm_setzero_si128(); - return _mm_unpacklo_epi32(a, zero); - } - - _NEON2SSE_INLINE __m128i _MM_CVTEPI8_EPI16(__m128i a) - { - __m128i zero = _mm_setzero_si128(); - __m128i sign = _mm_cmpgt_epi8(zero, a); - return _mm_unpacklo_epi8(a, sign); - } - - _NEON2SSE_INLINE __m128i _MM_CVTEPI16_EPI32(__m128i a) - { - __m128i zero = _mm_setzero_si128(); - __m128i sign = _mm_cmpgt_epi16(zero, a); - return _mm_unpacklo_epi16(a, sign); - } - - _NEON2SSE_INLINE __m128i _MM_CVTEPI32_EPI64(__m128i a) - { - __m128i zero = _mm_setzero_si128(); - __m128i sign = _mm_cmpgt_epi32(zero, a); - return _mm_unpacklo_epi32(a, sign); - } - - _NEON2SSE_INLINE int _MM_EXTRACT_EPI32(__m128i vec, const int LANE) - { - _NEON2SSE_ALIGN_16 int32_t tmp[4]; - _mm_store_si128((__m128i*)tmp, vec); - return tmp[LANE]; - } - - _NEON2SSE_INLINE int _MM_EXTRACT_EPI8(__m128i vec, const int LANE) - { - _NEON2SSE_ALIGN_16 int8_t tmp[16]; - _mm_store_si128((__m128i*)tmp, vec); - return (int)tmp[LANE]; - } - - _NEON2SSE_INLINE int _MM_EXTRACT_PS(__m128 vec, const int LANE) - { - _NEON2SSE_ALIGN_16 int32_t tmp[4]; - _mm_store_si128((__m128i*)tmp, _M128i(vec)); - return tmp[LANE]; - } - - _NEON2SSE_INLINE __m128i _MM_INSERT_EPI32(__m128i vec, int p, const int LANE) - { - _NEON2SSE_ALIGN_16 int32_t pvec[4] = {0,0,0,0}; - _NEON2SSE_ALIGN_16 uint32_t mask[4] = {0xffffffff,0xffffffff,0xffffffff,0xffffffff}; - __m128i vec_masked, p_masked; - pvec[LANE] = p; - mask[LANE] = 0x0; - vec_masked = _mm_and_si128 (*(__m128i*)mask,vec); //ready for p - p_masked = _mm_andnot_si128 (*(__m128i*)mask,*(__m128i*)pvec); //ready for vec - return _mm_or_si128(vec_masked, 
p_masked); - } - - _NEON2SSE_INLINE __m128i _MM_INSERT_EPI8(__m128i vec, int p, const int LANE) - { - _NEON2SSE_ALIGN_16 int8_t pvec[16] = {0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0}; - _NEON2SSE_ALIGN_16 uint8_t mask[16] = {0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff}; - __m128i vec_masked, p_masked; - pvec[LANE] = (int8_t)p; - mask[LANE] = 0x0; - vec_masked = _mm_and_si128 (*(__m128i*)mask,vec); //ready for p - p_masked = _mm_andnot_si128 (*(__m128i*)mask,*(__m128i*)pvec); //ready for vec - return _mm_or_si128(vec_masked, p_masked); - } - - _NEON2SSE_INLINE __m128 _MM_INSERT_PS(__m128 vec, __m128 p, const int LANE) - { - _NEON2SSE_ALIGN_16 uint32_t mask[4] = {0xffffffff,0xffffffff,0xffffffff,0xffffffff}; - __m128 tmp, vec_masked, p_masked; - mask[LANE >> 4] = 0x0; //LANE here is the insert_ps immediate, not a plain lane index: the destination lane is encoded in bits [5:4] - vec_masked = _mm_and_ps (*(__m128*)mask,vec); //ready for p - p_masked = _mm_andnot_ps (*(__m128*)mask, p); //ready for vec - tmp = _mm_or_ps(vec_masked, p_masked); - return tmp; - } - - _NEON2SSE_INLINE __m128i _MM_MAX_EPI8(__m128i a, __m128i b) - { - __m128i cmp, resa, resb; - cmp = _mm_cmpgt_epi8 (a, b); - resa = _mm_and_si128 (cmp, a); - resb = _mm_andnot_si128 (cmp,b); - return _mm_or_si128(resa, resb); - } - - _NEON2SSE_INLINE __m128i _MM_MAX_EPI32(__m128i a, __m128i b) - { - __m128i cmp, resa, resb; - cmp = _mm_cmpgt_epi32(a, b); - resa = _mm_and_si128 (cmp, a); - resb = _mm_andnot_si128 (cmp,b); - return _mm_or_si128(resa, resb); - } - - _NEON2SSE_INLINE __m128i _MM_MAX_EPU16(__m128i a, __m128i b) - { - __m128i c8000, b_s, a_s, cmp; - c8000 = _mm_cmpeq_epi16 (a,a); //0xffff - c8000 = _mm_slli_epi16 (c8000, 15); //0x8000 - b_s = _mm_sub_epi16 (b, c8000); - a_s = _mm_sub_epi16 (a, c8000); - cmp = _mm_cmpgt_epi16 (a_s, b_s); //no unsigned comparison, need to go to signed - a_s = _mm_and_si128 (cmp,a); - b_s = _mm_andnot_si128 (cmp,b); - return _mm_or_si128(a_s, b_s); - } - - _NEON2SSE_INLINE __m128i
_MM_MAX_EPU32(__m128i a, __m128i b) - { - __m128i c80000000, b_s, a_s, cmp; - c80000000 = _mm_cmpeq_epi32 (a,a); //0xffffffff - c80000000 = _mm_slli_epi32 (c80000000, 31); //0x80000000 - b_s = _mm_sub_epi32 (b, c80000000); - a_s = _mm_sub_epi32 (a, c80000000); - cmp = _mm_cmpgt_epi32 (a_s, b_s); //no unsigned comparison, need to go to signed - a_s = _mm_and_si128 (cmp,a); - b_s = _mm_andnot_si128 (cmp,b); - return _mm_or_si128(a_s, b_s); - } - - _NEON2SSE_INLINE __m128i _MM_MIN_EPI8(__m128i a, __m128i b) - { - __m128i cmp, resa, resb; - cmp = _mm_cmpgt_epi8 (b, a); - resa = _mm_and_si128 (cmp, a); - resb = _mm_andnot_si128 (cmp,b); - return _mm_or_si128(resa, resb); - } - - _NEON2SSE_INLINE __m128i _MM_MIN_EPI32(__m128i a, __m128i b) - { - __m128i cmp, resa, resb; - cmp = _mm_cmpgt_epi32(b, a); - resa = _mm_and_si128 (cmp, a); - resb = _mm_andnot_si128 (cmp,b); - return _mm_or_si128(resa, resb); - } - - _NEON2SSE_INLINE __m128i _MM_MIN_EPU16(__m128i a, __m128i b) - { - __m128i c8000, b_s, a_s, cmp; - c8000 = _mm_cmpeq_epi16 (a,a); //0xffff - c8000 = _mm_slli_epi16 (c8000, 15); //0x8000 - b_s = _mm_sub_epi16 (b, c8000); - a_s = _mm_sub_epi16 (a, c8000); - cmp = _mm_cmpgt_epi16 (b_s, a_s); //no unsigned comparison, need to go to signed - a_s = _mm_and_si128 (cmp,a); - b_s = _mm_andnot_si128 (cmp,b); - return _mm_or_si128(a_s, b_s); - } - - _NEON2SSE_INLINE __m128i _MM_MIN_EPU32(__m128i a, __m128i b) - { - __m128i c80000000, b_s, a_s, cmp; - c80000000 = _mm_cmpeq_epi32 (a,a); //0xffffffff - c80000000 = _mm_slli_epi32 (c80000000, 31); //0x80000000 - b_s = _mm_sub_epi32 (b, c80000000); - a_s = _mm_sub_epi32 (a, c80000000); - cmp = _mm_cmpgt_epi32 (b_s, a_s); //no unsigned comparison, need to go to signed - a_s = _mm_and_si128 (cmp,a); - b_s = _mm_andnot_si128 (cmp,b); - return _mm_or_si128(a_s, b_s); - } - - _NEON2SSE_INLINE __m128i _MM_BLENDV_EPI8(__m128i a, __m128i b, __m128i mask) //this is NOT exact implementation of _mm_blendv_epi8 !!!!! 
- please see below - { - //it assumes mask is either 0xff or 0 always (like in all use cases below) while for the original _mm_blendv_epi8 only the MSB of each mask byte matters. - __m128i a_masked, b_masked; - b_masked = _mm_and_si128 (mask,b); //use b if mask 0xff - a_masked = _mm_andnot_si128 (mask,a); - return _mm_or_si128(a_masked, b_masked); - } - - _NEON2SSE_INLINE __m128i _MM_PACKUS_EPI32(__m128i a, __m128i b) - { - _NEON2SSE_ALIGN_16 int8_t mask8_32_even_odd[16] = { 0,1, 4,5, 8,9, 12,13, 2,3, 6,7,10,11,14,15}; - __m128i a16, b16, res, reshi,cmp, zero; - zero = _mm_setzero_si128(); - a16 = _mm_shuffle_epi8 (a, *(__m128i*) mask8_32_even_odd); - b16 = _mm_shuffle_epi8 (b, *(__m128i*) mask8_32_even_odd); - res = _mm_unpacklo_epi64(a16, b16); //result without saturation - reshi = _mm_unpackhi_epi64(a16, b16); //hi part of result used for saturation - cmp = _mm_cmpgt_epi16(zero, reshi); //if reshi<0 the result should be zero - res = _mm_andnot_si128(cmp,res); //if cmp is zero - do nothing, otherwise reshi<0 and the result is 0 - cmp = _mm_cmpgt_epi16(reshi,zero); //if reshi is positive - return _mm_or_si128(res, cmp); //if reshi is positive the value is out of the 16-bit range, saturate to 0xffff - } - - _NEON2SSE_INLINE __m128i _MM_PACKUS1_EPI32(__m128i a) - { - _NEON2SSE_ALIGN_16 int8_t mask8_32_even_odd[16] = { 0,1, 4,5, 8,9, 12,13, 2,3, 6,7,10,11,14,15}; - __m128i a16, res, reshi,cmp, zero; - zero = _mm_setzero_si128(); - a16 = _mm_shuffle_epi8 (a, *(__m128i*)mask8_32_even_odd); - reshi = _mm_unpackhi_epi64(a16, a16); //hi part of result used for saturation - cmp = _mm_cmpgt_epi16(zero, reshi); //if reshi<0 the result should be zero - res = _mm_andnot_si128(cmp, a16); //if cmp is zero - do nothing, otherwise reshi<0 and the result is 0 - cmp = _mm_cmpgt_epi16(reshi,zero); //if reshi is positive - return _mm_or_si128(res, cmp); //if reshi is positive the value is out of the 16-bit range, saturate to 0xffff - } - - - _NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(__m128i _MM_MULLO_EPI32(__m128i a, __m128i b),
_NEON2SSE_REASON_SLOW_SERIAL) - { - _NEON2SSE_ALIGN_16 int32_t atmp[4], btmp[4], res[4]; - int64_t res64; - int i; - _mm_store_si128((__m128i*)atmp, a); - _mm_store_si128((__m128i*)btmp, b); - for (i = 0; i<4; i++) { - res64 = (int64_t)atmp[i] * btmp[i]; //widen before multiplying to avoid signed overflow - res[i] = (int)(res64 & 0xffffffff); - } - return _mm_load_si128((__m128i*)res); - } - - _NEON2SSE_INLINE __m128i _MM_MUL_EPI32(__m128i a, __m128i b) - { - __m128i sign, zero, mul_us, a_neg, b_neg, mul_us_neg; - sign = _mm_xor_si128 (a, b); - sign = _mm_srai_epi32 (sign, 31); //promote the sign bit to all fields: all ones if negative, all zeros if positive - zero = _mm_setzero_si128(); - a_neg = _mm_abs_epi32 (a); //absolute value of a - b_neg = _mm_abs_epi32 (b); //absolute value of b - mul_us = _mm_mul_epu32 (a_neg, b_neg); //uses lanes 0 and 2 of the absolute values; the multiplication gives a 64-bit result - mul_us_neg = _mm_sub_epi64(zero, mul_us); - mul_us_neg = _mm_and_si128(sign, mul_us_neg); - mul_us = _mm_andnot_si128(sign, mul_us); - return _mm_or_si128 (mul_us, mul_us_neg); - } - - _NEON2SSE_INLINE __m128i _MM_CMPEQ_EPI64(__m128i a, __m128i b) - { - __m128i res; - res = _mm_cmpeq_epi32 (a, b); - return _mm_shuffle_epi32 (res, 1 | (1 << 2) | (3 << 4) | (3 << 6)); //copy the information from the hi to the low part of each 64-bit lane - } -#endif //SSE4 - -//the special case of functions for 32-bit builds without SSE4 -_NEON2SSE_INLINE __m128i _MM_INSERT_EPI64_32(__m128i vec, int p, const int LANE) -{ - _NEON2SSE_ALIGN_16 uint64_t pvec[2] = {0,0}; - _NEON2SSE_ALIGN_16 uint64_t mask[2] = {0xffffffffffffffff, 0xffffffffffffffff}; - __m128i vec_masked, p_masked; - pvec[LANE] = p; - mask[LANE] = 0x0; - vec_masked = _mm_and_si128 (*(__m128i*)mask,vec); //ready for p - p_masked = _mm_andnot_si128 (*(__m128i*)mask,*(__m128i*)pvec); //ready for vec - return _mm_or_si128(vec_masked, p_masked); -} - -_NEON2SSE_INLINE int64_t _MM_EXTRACT_EPI64_32(__m128i val, const int LANE) -{ - _NEON2SSE_ALIGN_16 int64_t tmp[2]; - _mm_store_si128((__m128i*)tmp, val); -
return tmp[LANE]; -} - -#ifndef _NEON2SSE_64BIT_SSE4 - #define _MM_INSERT_EPI64 _MM_INSERT_EPI64_32 - #define _MM_EXTRACT_EPI64 _MM_EXTRACT_EPI64_32 -#endif - -int32x4_t vqd_s32(int32x4_t a); //Saturating doubling for signed ints -_NEON2SSE_INLINE int32x4_t vqd_s32(int32x4_t a) -{ - //Overflow happens only if a and the doubled result have opposite signs - __m128i c7fffffff, res, res_sat, res_xor_a; - c7fffffff = _mm_set1_epi32(0x7fffffff); - res = _mm_slli_epi32 (a, 1); // res = a*2 - res_sat = _mm_srli_epi32(a, 31); - res_sat = _mm_add_epi32(res_sat, c7fffffff); - res_xor_a = _mm_xor_si128(res, a); - res_xor_a = _mm_srai_epi32(res_xor_a,31); //propagate the sign bit: all ones if res_xor_a < 0, all zeros otherwise - res_sat = _mm_and_si128(res_xor_a, res_sat); - res = _mm_andnot_si128(res_xor_a, res); - return _mm_or_si128(res, res_sat); -} - - -//!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -//************************************************************************* -//************************************************************************* -//***************** Functions redefinition/implementation starts here ***** -//************************************************************************* -//************************************************************************* -//!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - -/*If a unified intrinsics solution is necessary, define your SSE intrinsics wrappers here as in the following sample: -#ifdef ARM -#define vector_addq_s32 _mm_add_epi32 -#else //if we have IA -#define vector_addq_s32 vadd_s32 -#endif - -******************************************************************************************** -Functions below are organised in the following way: - -Each NEON intrinsic function has one of the following options: -1. its x86 full equivalent SSE intrinsic - in this case the x86 version simply follows the NEON one under the corresponding #define statement -2.
x86 implementation using more than one x86 intrinsic. In this case it is shaped as an inlined C function with a return statement -3. a reference to a NEON function returning the same result and implemented in x86 as above. In this case it is shaped as a matching NEON function definition -4. for about 5% of functions, because the corresponding x86 SIMD operation is unavailable or inefficient in terms of performance, -a serial implementation is provided along with the corresponding compiler warning. If these functions are on your app's critical path -- please consider removing them from your code. -*/ - -//*********************************************************************** -//************************ Vector add ***************************** -//*********************************************************************** -int8x8_t vadd_s8(int8x8_t a, int8x8_t b); // VADD.I8 d0,d0,d0 -_NEON2SSE_INLINE int8x8_t vadd_s8(int8x8_t a, int8x8_t b) -{ - int8x8_t res64; - return64(_mm_add_epi8(_pM128i(a),_pM128i(b))); -} - - -int16x4_t vadd_s16(int16x4_t a, int16x4_t b); // VADD.I16 d0,d0,d0 -_NEON2SSE_INLINE int16x4_t vadd_s16(int16x4_t a, int16x4_t b) -{ - int16x4_t res64; - return64(_mm_add_epi16(_pM128i(a),_pM128i(b))); -} - - -int32x2_t vadd_s32(int32x2_t a, int32x2_t b); // VADD.I32 d0,d0,d0 -_NEON2SSE_INLINE int32x2_t vadd_s32(int32x2_t a, int32x2_t b) -{ - int32x2_t res64; - return64(_mm_add_epi32(_pM128i(a),_pM128i(b))); -} - - -int64x1_t vadd_s64(int64x1_t a, int64x1_t b); // VADD.I64 d0,d0,d0 -_NEON2SSE_INLINE int64x1_t vadd_s64(int64x1_t a, int64x1_t b) -{ - int64x1_t res64; - res64.m64_i64[0] = a.m64_i64[0] + b.m64_i64[0]; - return res64; -} - - -float32x2_t vadd_f32(float32x2_t a, float32x2_t b); // VADD.F32 d0,d0,d0 -_NEON2SSE_INLINE float32x2_t vadd_f32(float32x2_t a, float32x2_t b) -{ - __m128 res; - __m64_128 res64; - res = _mm_add_ps(_pM128(a),_pM128(b)); //SSE, use only low 64 bits - _M64f(res64, res); - return res64; -} - -uint8x8_t vadd_u8(uint8x8_t a, uint8x8_t
b); // VADD.I8 d0,d0,d0 -#define vadd_u8 vadd_s8 - -uint16x4_t vadd_u16(uint16x4_t a, uint16x4_t b); // VADD.I16 d0,d0,d0 -#define vadd_u16 vadd_s16 - -uint32x2_t vadd_u32(uint32x2_t a, uint32x2_t b); // VADD.I32 d0,d0,d0 -#define vadd_u32 vadd_s32 - -uint64x1_t vadd_u64(uint64x1_t a, uint64x1_t b); // VADD.I64 d0,d0,d0 -_NEON2SSE_INLINE uint64x1_t vadd_u64(uint64x1_t a, uint64x1_t b) -{ - uint64x1_t res64; - res64.m64_u64[0] = a.m64_u64[0] + b.m64_u64[0]; - return res64; -} - - -int8x16_t vaddq_s8(int8x16_t a, int8x16_t b); // VADD.I8 q0,q0,q0 -#define vaddq_s8 _mm_add_epi8 - -int16x8_t vaddq_s16(int16x8_t a, int16x8_t b); // VADD.I16 q0,q0,q0 -#define vaddq_s16 _mm_add_epi16 - -int32x4_t vaddq_s32(int32x4_t a, int32x4_t b); // VADD.I32 q0,q0,q0 -#define vaddq_s32 _mm_add_epi32 - -int64x2_t vaddq_s64(int64x2_t a, int64x2_t b); // VADD.I64 q0,q0,q0 -#define vaddq_s64 _mm_add_epi64 - -float32x4_t vaddq_f32(float32x4_t a, float32x4_t b); // VADD.F32 q0,q0,q0 -#define vaddq_f32 _mm_add_ps - -uint8x16_t vaddq_u8(uint8x16_t a, uint8x16_t b); // VADD.I8 q0,q0,q0 -#define vaddq_u8 _mm_add_epi8 - -uint16x8_t vaddq_u16(uint16x8_t a, uint16x8_t b); // VADD.I16 q0,q0,q0 -#define vaddq_u16 _mm_add_epi16 - -uint32x4_t vaddq_u32(uint32x4_t a, uint32x4_t b); // VADD.I32 q0,q0,q0 -#define vaddq_u32 _mm_add_epi32 - -uint64x2_t vaddq_u64(uint64x2_t a, uint64x2_t b); // VADD.I64 q0,q0,q0 -#define vaddq_u64 _mm_add_epi64 - -//**************************** Vector long add *****************************: -//*********************************************************************** -//Va, Vb have equal lane sizes, result is a 128 bit vector of lanes that are twice the width. 
-int16x8_t vaddl_s8(int8x8_t a, int8x8_t b); // VADDL.S8 q0,d0,d0 -_NEON2SSE_INLINE int16x8_t vaddl_s8(int8x8_t a, int8x8_t b) // VADDL.S8 q0,d0,d0 -{ - __m128i a16, b16; - a16 = _MM_CVTEPI8_EPI16 (_pM128i(a)); //SSE4.1, - b16 = _MM_CVTEPI8_EPI16 (_pM128i(b)); //SSE4.1, - return _mm_add_epi16 (a16, b16); -} - -int32x4_t vaddl_s16(int16x4_t a, int16x4_t b); // VADDL.S16 q0,d0,d0 -_NEON2SSE_INLINE int32x4_t vaddl_s16(int16x4_t a, int16x4_t b) // VADDL.S16 q0,d0,d0 -{ - __m128i a32, b32; - a32 = _MM_CVTEPI16_EPI32 (_pM128i(a)); //SSE4.1 - b32 = _MM_CVTEPI16_EPI32 (_pM128i(b)); //SSE4.1 - return _mm_add_epi32 (a32, b32); -} - -int64x2_t vaddl_s32(int32x2_t a, int32x2_t b); // VADDL.S32 q0,d0,d0 -_NEON2SSE_INLINE int64x2_t vaddl_s32(int32x2_t a, int32x2_t b) // VADDL.S32 q0,d0,d0 -{ - //may be not optimal - __m128i a64, b64; - a64 = _MM_CVTEPI32_EPI64 (_pM128i(a)); //SSE4.1 - b64 = _MM_CVTEPI32_EPI64 (_pM128i(b)); //SSE4.1 - return _mm_add_epi64 ( a64, b64); -} - -uint16x8_t vaddl_u8(uint8x8_t a, uint8x8_t b); // VADDL.U8 q0,d0,d0 -_NEON2SSE_INLINE uint16x8_t vaddl_u8(uint8x8_t a, uint8x8_t b) // VADDL.U8 q0,d0,d0 -{ - __m128i a16, b16; - a16 = _MM_CVTEPU8_EPI16 (_pM128i(a)); //SSE4.1 - b16 = _MM_CVTEPU8_EPI16 (_pM128i(b)); //SSE4.1 - return _mm_add_epi16 (a16, b16); -} - -uint32x4_t vaddl_u16(uint16x4_t a, uint16x4_t b); // VADDL.s16 q0,d0,d0 -_NEON2SSE_INLINE uint32x4_t vaddl_u16(uint16x4_t a, uint16x4_t b) // VADDL.s16 q0,d0,d0 -{ - __m128i a32, b32; - a32 = _MM_CVTEPU16_EPI32 (_pM128i(a)); //SSE4.1 - b32 = _MM_CVTEPU16_EPI32 (_pM128i(b)); //SSE4.1 - return _mm_add_epi32 (a32, b32); -} - -uint64x2_t vaddl_u32(uint32x2_t a, uint32x2_t b); // VADDL.U32 q0,d0,d0 -_NEON2SSE_INLINE uint64x2_t vaddl_u32(uint32x2_t a, uint32x2_t b) // VADDL.U32 q0,d0,d0 -{ - //may be not optimal - __m128i a64, b64; - a64 = _MM_CVTEPU32_EPI64 (_pM128i(a)); //SSE4.1 - b64 = _MM_CVTEPU32_EPI64 (_pM128i(b)); //SSE4.1 - return _mm_add_epi64 (a64, b64); -} - -//*************** Vector wide add: 
vaddw_<type>. Vr[i]:=Va[i]+Vb[i] ****************** -//*************** ********************************************************************* -int16x8_t vaddw_s8(int16x8_t a, int8x8_t b); // VADDW.S8 q0,q0,d0 -_NEON2SSE_INLINE int16x8_t vaddw_s8(int16x8_t a, int8x8_t b) // VADDW.S8 q0,q0,d0 -{ - __m128i b16; - b16 = _MM_CVTEPI8_EPI16 (_pM128i(b)); //SSE4.1, - return _mm_add_epi16 (a, b16); -} - -int32x4_t vaddw_s16(int32x4_t a, int16x4_t b); // VADDW.S16 q0,q0,d0 -_NEON2SSE_INLINE int32x4_t vaddw_s16(int32x4_t a, int16x4_t b) // VADDW.S16 q0,q0,d0 -{ - __m128i b32; - b32 = _MM_CVTEPI16_EPI32(_pM128i(b)); //SSE4.1, - return _mm_add_epi32 (a, b32); -} - -int64x2_t vaddw_s32(int64x2_t a, int32x2_t b); // VADDW.S32 q0,q0,d0 -_NEON2SSE_INLINE int64x2_t vaddw_s32(int64x2_t a, int32x2_t b) // VADDW.S32 q0,q0,d0 -{ - __m128i b64; - b64 = _MM_CVTEPI32_EPI64 (_pM128i(b)); //SSE4.1 - return _mm_add_epi64 (a, b64); -} - -uint16x8_t vaddw_u8(uint16x8_t a, uint8x8_t b); // VADDW.U8 q0,q0,d0 -_NEON2SSE_INLINE uint16x8_t vaddw_u8(uint16x8_t a, uint8x8_t b) // VADDW.U8 q0,q0,d0 -{ - __m128i b16; - b16 = _MM_CVTEPU8_EPI16 (_pM128i(b)); //SSE4.1 - return _mm_add_epi16 (a, b16); -} - -uint32x4_t vaddw_u16(uint32x4_t a, uint16x4_t b); // VADDW.s16 q0,q0,d0 -_NEON2SSE_INLINE uint32x4_t vaddw_u16(uint32x4_t a, uint16x4_t b) // VADDW.s16 q0,q0,d0 -{ - __m128i b32; - b32 = _MM_CVTEPU16_EPI32 (_pM128i(b)); //SSE4.1 - return _mm_add_epi32 (a, b32); -} - -uint64x2_t vaddw_u32(uint64x2_t a, uint32x2_t b); // VADDW.U32 q0,q0,d0 -_NEON2SSE_INLINE uint64x2_t vaddw_u32(uint64x2_t a, uint32x2_t b) // VADDW.U32 q0,q0,d0 -{ - __m128i b64; - b64 = _MM_CVTEPU32_EPI64 (_pM128i(b)); //SSE4.1 - return _mm_add_epi64 (a, b64); -} - -//******************************Vector halving add: vhadd -> Vr[i]:=(Va[i]+Vb[i])>>1 , result truncated ******************************* -//************************************************************************************************************************* -int8x8_t 
vhadd_s8(int8x8_t a, int8x8_t b); // VHADD.S8 d0,d0,d0 -_NEON2SSE_INLINE int8x8_t vhadd_s8(int8x8_t a, int8x8_t b) -{ - int8x8_t res64; - return64(vhaddq_s8(_pM128i(a), _pM128i(b))); -} - - -int16x4_t vhadd_s16(int16x4_t a, int16x4_t b); // VHADD.S16 d0,d0,d0 -_NEON2SSE_INLINE int16x4_t vhadd_s16(int16x4_t a, int16x4_t b) -{ - int16x4_t res64; - return64( vhaddq_s16(_pM128i(a), _pM128i(b))); -} - - -int32x2_t vhadd_s32(int32x2_t a, int32x2_t b); // VHADD.S32 d0,d0,d0 -_NEON2SSE_INLINE int32x2_t vhadd_s32(int32x2_t a, int32x2_t b) -{ - int32x2_t res64; - return64( vhaddq_s32(_pM128i(a), _pM128i(b))); -} - - -uint8x8_t vhadd_u8(uint8x8_t a, uint8x8_t b); // VHADD.U8 d0,d0,d0 -_NEON2SSE_INLINE uint8x8_t vhadd_u8(uint8x8_t a, uint8x8_t b) -{ - uint8x8_t res64; - return64( vhaddq_u8(_pM128i(a), _pM128i(b))); -} - - -uint16x4_t vhadd_u16(uint16x4_t a, uint16x4_t b); // VHADD.s16 d0,d0,d0 -_NEON2SSE_INLINE uint16x4_t vhadd_u16(uint16x4_t a, uint16x4_t b) -{ - uint16x4_t res64; - return64( vhaddq_u16(_pM128i(a), _pM128i(b))); -} - - -uint32x2_t vhadd_u32(uint32x2_t a, uint32x2_t b); // VHADD.U32 d0,d0,d0 -_NEON2SSE_INLINE uint32x2_t vhadd_u32(uint32x2_t a, uint32x2_t b) -{ - uint32x2_t res64; - return64( vhaddq_u32(_pM128i(a), _pM128i(b))); -} - - -int8x16_t vhaddq_s8(int8x16_t a, int8x16_t b); // VHADD.S8 q0,q0,q0 -_NEON2SSE_INLINE int8x16_t vhaddq_s8(int8x16_t a, int8x16_t b) -{ - //need to avoid internal overflow, will use the (x&y)+((x^y)>>1). - __m128i tmp1, tmp2; - tmp1 = _mm_and_si128(a,b); - tmp2 = _mm_xor_si128(a,b); - tmp2 = vshrq_n_s8(tmp2,1); - return _mm_add_epi8(tmp1,tmp2); -} - -int16x8_t vhaddq_s16(int16x8_t a, int16x8_t b); // VHADD.S16 q0,q0,q0 -_NEON2SSE_INLINE int16x8_t vhaddq_s16(int16x8_t a, int16x8_t b) -{ - //need to avoid internal overflow, will use the (x&y)+((x^y)>>1). 
- __m128i tmp1, tmp2; - tmp1 = _mm_and_si128(a,b); - tmp2 = _mm_xor_si128(a,b); - tmp2 = _mm_srai_epi16(tmp2,1); - return _mm_add_epi16(tmp1,tmp2); -} - -int32x4_t vhaddq_s32(int32x4_t a, int32x4_t b); // VHADD.S32 q0,q0,q0 -_NEON2SSE_INLINE int32x4_t vhaddq_s32(int32x4_t a, int32x4_t b) // VHADD.S32 q0,q0,q0 -{ - //need to avoid internal overflow, will use the (x&y)+((x^y)>>1). - __m128i tmp1, tmp2; - tmp1 = _mm_and_si128(a,b); - tmp2 = _mm_xor_si128(a,b); - tmp2 = _mm_srai_epi32(tmp2,1); - return _mm_add_epi32(tmp1,tmp2); -} - -uint8x16_t vhaddq_u8(uint8x16_t a, uint8x16_t b); // VHADD.U8 q0,q0,q0 -_NEON2SSE_INLINE uint8x16_t vhaddq_u8(uint8x16_t a, uint8x16_t b) // VHADD.U8 q0,q0,q0 -{ - __m128i c1, sum, res; - c1 = _mm_set1_epi8(1); - sum = _mm_avg_epu8(a, b); //result is rounded, need to compensate it - res = _mm_xor_si128(a, b); //for rounding compensation - res = _mm_and_si128(res,c1); //for rounding compensation - return _mm_sub_epi8 (sum, res); //actual rounding compensation -} - -uint16x8_t vhaddq_u16(uint16x8_t a, uint16x8_t b); // VHADD.s16 q0,q0,q0 -_NEON2SSE_INLINE uint16x8_t vhaddq_u16(uint16x8_t a, uint16x8_t b) // VHADD.s16 q0,q0,q0 -{ - __m128i sum, res; - sum = _mm_avg_epu16(a, b); //result is rounded, need to compensate it - res = _mm_xor_si128(a, b); //for rounding compensation - res = _mm_slli_epi16 (res,15); //shift left then back right to - res = _mm_srli_epi16 (res,15); //get 1 or zero - return _mm_sub_epi16 (sum, res); //actual rounding compensation -} - -uint32x4_t vhaddq_u32(uint32x4_t a, uint32x4_t b); // VHADD.U32 q0,q0,q0 -_NEON2SSE_INLINE uint32x4_t vhaddq_u32(uint32x4_t a, uint32x4_t b) // VHADD.U32 q0,q0,q0 -{ - //need to avoid internal overflow, will use the (x&y)+((x^y)>>1). - __m128i tmp1, tmp2; - tmp1 = _mm_and_si128(a,b); - tmp2 = _mm_xor_si128(a,b); - tmp2 = _mm_srli_epi32(tmp2,1); - return _mm_add_epi32(tmp1,tmp2); -} - -//************************Vector rounding halving add: vrhadd{q}_<type>. 
Vr[i]:=(Va[i]+Vb[i]+1)>>1 *************************** -//***************************************************************************************************************************** -int8x8_t vrhadd_s8(int8x8_t a, int8x8_t b); // VRHADD.S8 d0,d0,d0 -_NEON2SSE_INLINE int8x8_t vrhadd_s8(int8x8_t a, int8x8_t b) -{ - int8x8_t res64; - return64(vrhaddq_s8(_pM128i(a), _pM128i(b))); -} - - -int16x4_t vrhadd_s16(int16x4_t a, int16x4_t b); // VRHADD.S16 d0,d0,d0 -_NEON2SSE_INLINE int16x4_t vrhadd_s16(int16x4_t a, int16x4_t b) -{ - int16x4_t res64; - return64(vrhaddq_s16(_pM128i(a), _pM128i(b))); -} - - -int32x2_t vrhadd_s32(int32x2_t a, int32x2_t b); // VRHADD.S32 d0,d0,d0 -_NEON2SSE_INLINE int32x2_t vrhadd_s32(int32x2_t a, int32x2_t b) -{ - int32x2_t res64; - return64(vrhaddq_s32(_pM128i(a), _pM128i(b))); -} - - -uint8x8_t vrhadd_u8(uint8x8_t a, uint8x8_t b); // VRHADD.U8 d0,d0,d0 -_NEON2SSE_INLINE uint8x8_t vrhadd_u8(uint8x8_t a, uint8x8_t b) -{ - uint8x8_t res64; - return64(_mm_avg_epu8(_pM128i(a),_pM128i(b))); //SSE, result rounding!!! -} - - -uint16x4_t vrhadd_u16(uint16x4_t a, uint16x4_t b); // VRHADD.s16 d0,d0,d0 -_NEON2SSE_INLINE uint16x4_t vrhadd_u16(uint16x4_t a, uint16x4_t b) -{ - uint16x4_t res64; - return64(_mm_avg_epu16(_pM128i(a),_pM128i(b))); //SSE, result rounding!!! 
-} - - -uint32x2_t vrhadd_u32(uint32x2_t a, uint32x2_t b); // VRHADD.U32 d0,d0,d0 -_NEON2SSE_INLINE uint32x2_t vrhadd_u32(uint32x2_t a, uint32x2_t b) -{ - uint32x2_t res64; - return64(vrhaddq_u32(_pM128i(a), _pM128i(b))); -} - - -int8x16_t vrhaddq_s8(int8x16_t a, int8x16_t b); // VRHADD.S8 q0,q0,q0 -_NEON2SSE_INLINE int8x16_t vrhaddq_s8(int8x16_t a, int8x16_t b) // VRHADD.S8 q0,q0,q0 -{ - //no signed average in x86 SIMD, go to unsigned - __m128i c128, au, bu, sum; - c128 = _mm_set1_epi8(0x80); //-128 - au = _mm_sub_epi8(a, c128); //add 128 - bu = _mm_sub_epi8(b, c128); //add 128 - sum = _mm_avg_epu8(au, bu); - return _mm_add_epi8 (sum, c128); //sub 128 -} - -int16x8_t vrhaddq_s16(int16x8_t a, int16x8_t b); // VRHADD.S16 q0,q0,q0 -_NEON2SSE_INLINE int16x8_t vrhaddq_s16(int16x8_t a, int16x8_t b) // VRHADD.S16 q0,q0,q0 -{ - //no signed average in x86 SIMD, go to unsigned - __m128i cx8000, au, bu, sum; - cx8000 = _mm_set1_epi16(0x8000); // - 32768 - au = _mm_sub_epi16(a, cx8000); //add 32768 - bu = _mm_sub_epi16(b, cx8000); //add 32768 - sum = _mm_avg_epu16(au, bu); - return _mm_add_epi16 (sum, cx8000); //sub 32768 -} - -int32x4_t vrhaddq_s32(int32x4_t a, int32x4_t b); // VRHADD.S32 q0,q0,q0 -_NEON2SSE_INLINE int32x4_t vrhaddq_s32(int32x4_t a, int32x4_t b) -{ - //need to avoid overflow - __m128i a2, b2, res, sum; - a2 = _mm_srai_epi32(a,1); //a2=a/2; - b2 = _mm_srai_epi32(b,1); // b2=b/2; - res = _mm_or_si128(a,b); //for rounding - res = _mm_slli_epi32 (res,31); //shift left then back right to - res = _mm_srli_epi32 (res,31); //get 1 or zero - sum = _mm_add_epi32(a2,b2); - return _mm_add_epi32(sum,res); -} - -uint8x16_t vrhaddq_u8(uint8x16_t a, uint8x16_t b); // VRHADD.U8 q0,q0,q0 -#define vrhaddq_u8 _mm_avg_epu8 //SSE2, results rounded - -uint16x8_t vrhaddq_u16(uint16x8_t a, uint16x8_t b); // VRHADD.s16 q0,q0,q0 -#define vrhaddq_u16 _mm_avg_epu16 //SSE2, results rounded - - -uint32x4_t vrhaddq_u32(uint32x4_t a, uint32x4_t b); // VRHADD.U32 q0,q0,q0 -_NEON2SSE_INLINE 
uint32x4_t vrhaddq_u32(uint32x4_t a, uint32x4_t b) // VRHADD.U32 q0,q0,q0 -{ - //need to avoid overflow - __m128i a2, b2, res, sum; - a2 = _mm_srli_epi32(a,1); //a2=a/2; - b2 = _mm_srli_epi32(b,1); // b2=b/2; - res = _mm_or_si128(a,b); //for rounding - res = _mm_slli_epi32 (res,31); //shift left then back right to - res = _mm_srli_epi32 (res,31); //get 1 or zero - sum = _mm_add_epi32(a2,b2); - return _mm_add_epi32(sum,res); -} - -//****************** VQADD: Vector saturating add ************************ -//************************************************************************ -int8x8_t vqadd_s8(int8x8_t a, int8x8_t b); // VQADD.S8 d0,d0,d0 -_NEON2SSE_INLINE int8x8_t vqadd_s8(int8x8_t a, int8x8_t b) -{ - int8x8_t res64; - return64(_mm_adds_epi8(_pM128i(a),_pM128i(b))); -} - - -int16x4_t vqadd_s16(int16x4_t a, int16x4_t b); // VQADD.S16 d0,d0,d0 -_NEON2SSE_INLINE int16x4_t vqadd_s16(int16x4_t a, int16x4_t b) -{ - int16x4_t res64; - return64(_mm_adds_epi16(_pM128i(a),_pM128i(b))); -} - - -int32x2_t vqadd_s32(int32x2_t a, int32x2_t b); // VQADD.S32 d0,d0,d0 -_NEON2SSE_INLINE int32x2_t vqadd_s32(int32x2_t a, int32x2_t b) -{ - int32x2_t res64; - return64(vqaddq_s32(_pM128i(a), _pM128i(b))); -} - - -int64x1_t vqadd_s64(int64x1_t a, int64x1_t b); // VQADD.S64 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x1_t vqadd_s64(int64x1_t a, int64x1_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - int64x1_t res; - uint64_t a64, b64; - a64 = a.m64_u64[0]; - b64 = b.m64_u64[0]; - res.m64_u64[0] = a64 + b64; - a64 = (a64 >> 63) + (~_SIGNBIT64); - if ((int64_t)((b64 ^ a64) | ~(res.m64_u64[0] ^ b64))>=0) { - res.m64_u64[0] = a64; - } - return res; -} - -uint8x8_t vqadd_u8(uint8x8_t a, uint8x8_t b); // VQADD.U8 d0,d0,d0 -_NEON2SSE_INLINE uint8x8_t vqadd_u8(uint8x8_t a, uint8x8_t b) -{ - uint8x8_t res64; - return64(_mm_adds_epu8(_pM128i(a),_pM128i(b))); -} - - -uint16x4_t vqadd_u16(uint16x4_t a, uint16x4_t b); // VQADD.s16 d0,d0,d0 -_NEON2SSE_INLINE uint16x4_t 
vqadd_u16(uint16x4_t a, uint16x4_t b) -{ - uint16x4_t res64; - return64(_mm_adds_epu16(_pM128i(a),_pM128i(b))); -} - - -uint32x2_t vqadd_u32(uint32x2_t a, uint32x2_t b); // VQADD.U32 d0,d0,d0 -_NEON2SSE_INLINE uint32x2_t vqadd_u32(uint32x2_t a, uint32x2_t b) -{ - uint32x2_t res64; - return64(vqaddq_u32(_pM128i(a), _pM128i(b))); -} - - -uint64x1_t vqadd_u64(uint64x1_t a, uint64x1_t b); // VQADD.U64 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x1_t vqadd_u64(uint64x1_t a, uint64x1_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - _NEON2SSE_ALIGN_16 uint64_t a64, b64; - uint64x1_t res; - a64 = a.m64_u64[0]; - b64 = b.m64_u64[0]; - res.m64_u64[0] = a64 + b64; - if (res.m64_u64[0] < a64) { - res.m64_u64[0] = ~(uint64_t)0; - } - return res; -} - -int8x16_t vqaddq_s8(int8x16_t a, int8x16_t b); // VQADD.S8 q0,q0,q0 -#define vqaddq_s8 _mm_adds_epi8 - -int16x8_t vqaddq_s16(int16x8_t a, int16x8_t b); // VQADD.S16 q0,q0,q0 -#define vqaddq_s16 _mm_adds_epi16 - -int32x4_t vqaddq_s32(int32x4_t a, int32x4_t b); // VQADD.S32 q0,q0,q0 -_NEON2SSE_INLINE int32x4_t vqaddq_s32(int32x4_t a, int32x4_t b) -{ - //no corresponding x86 SIMD solution, special tricks are necessary. 
Overflow happens only if a and b have the same sign and sum has the opposite sign - __m128i c7fffffff, res, res_sat, res_xor_a, b_xor_a_; - c7fffffff = _mm_set1_epi32(0x7fffffff); - res = _mm_add_epi32(a, b); - res_sat = _mm_srli_epi32(a, 31); - res_sat = _mm_add_epi32(res_sat, c7fffffff); - res_xor_a = _mm_xor_si128(res, a); - b_xor_a_ = _mm_xor_si128(b, a); - res_xor_a = _mm_andnot_si128(b_xor_a_, res_xor_a); - res_xor_a = _mm_srai_epi32(res_xor_a,31); //propagate the sign bit: all ones if <0, all zeros otherwise - res_sat = _mm_and_si128(res_xor_a, res_sat); - res = _mm_andnot_si128(res_xor_a, res); - return _mm_or_si128(res, res_sat); -} - -int64x2_t vqaddq_s64(int64x2_t a, int64x2_t b); // VQADD.S64 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x2_t vqaddq_s64(int64x2_t a, int64x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - _NEON2SSE_ALIGN_16 uint64_t atmp[2], btmp[2], res[2]; - _mm_store_si128((__m128i*)atmp, a); - _mm_store_si128((__m128i*)btmp, b); - res[0] = atmp[0] + btmp[0]; - res[1] = atmp[1] + btmp[1]; - - atmp[0] = (atmp[0] >> 63) + (~_SIGNBIT64); - atmp[1] = (atmp[1] >> 63) + (~_SIGNBIT64); - - if ((int64_t)((btmp[0] ^ atmp[0]) | ~(res[0] ^ btmp[0]))>=0) { - res[0] = atmp[0]; - } - if ((int64_t)((btmp[1] ^ atmp[1]) | ~(res[1] ^ btmp[1]))>=0) { - res[1] = atmp[1]; - } - return _mm_load_si128((__m128i*)res); -} - -uint8x16_t vqaddq_u8(uint8x16_t a, uint8x16_t b); // VQADD.U8 q0,q0,q0 -#define vqaddq_u8 _mm_adds_epu8 - -uint16x8_t vqaddq_u16(uint16x8_t a, uint16x8_t b); // VQADD.s16 q0,q0,q0 -#define vqaddq_u16 _mm_adds_epu16 - -uint32x4_t vqaddq_u32(uint32x4_t a, uint32x4_t b); // VQADD.U32 q0,q0,q0 -_NEON2SSE_INLINE uint32x4_t vqaddq_u32(uint32x4_t a, uint32x4_t b) -{ - __m128i c80000000, cmp, subsum, suba, sum; - c80000000 = _mm_set1_epi32 (0x80000000); - sum = _mm_add_epi32 (a, b); - subsum = _mm_sub_epi32 (sum, c80000000); - suba = _mm_sub_epi32 (a, c80000000); - cmp = _mm_cmpgt_epi32 ( suba, subsum); //no unsigned comparison, need to go 
to signed - return _mm_or_si128 (sum, cmp); //saturation -} - -uint64x2_t vqaddq_u64(uint64x2_t a, uint64x2_t b); // VQADD.U64 q0,q0,q0 -#ifdef USE_SSE4 - _NEON2SSE_INLINE uint64x2_t vqaddq_u64(uint64x2_t a, uint64x2_t b) - { - __m128i c80000000, sum, cmp, suba, subsum; - c80000000 = _mm_set_epi32 (0x80000000, 0x0, 0x80000000, 0x0); - sum = _mm_add_epi64 (a, b); - subsum = _mm_sub_epi64 (sum, c80000000); - suba = _mm_sub_epi64 (a, c80000000); - cmp = _mm_cmpgt_epi64 ( suba, subsum); //no unsigned comparison, need to go to signed, SSE4.2!!! - return _mm_or_si128 (sum, cmp); //saturation - } -#else - _NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x2_t vqaddq_u64(uint64x2_t a, uint64x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) - { - _NEON2SSE_ALIGN_16 uint64_t atmp[2], btmp[2], res[2]; - _mm_store_si128((__m128i*)atmp, a); - _mm_store_si128((__m128i*)btmp, b); - res[0] = atmp[0] + btmp[0]; - res[1] = atmp[1] + btmp[1]; - if (res[0] < atmp[0]) res[0] = ~(uint64_t)0; - if (res[1] < atmp[1]) res[1] = ~(uint64_t)0; - return _mm_load_si128((__m128i*)(res)); - } -#endif - - -//******************* Vector add high half (truncated) ****************** -//************************************************************************ -int8x8_t vaddhn_s16(int16x8_t a, int16x8_t b); // VADDHN.I16 d0,q0,q0 -_NEON2SSE_INLINE int8x8_t vaddhn_s16(int16x8_t a, int16x8_t b) // VADDHN.I16 d0,q0,q0 -{ - int8x8_t res64; - __m128i sum; - sum = _mm_add_epi16 (a, b); - sum = _mm_srai_epi16 (sum, 8); - sum = _mm_packs_epi16 (sum, sum); //use 64 low bits only - return64(sum); -} - -int16x4_t vaddhn_s32(int32x4_t a, int32x4_t b); // VADDHN.I32 d0,q0,q0 -_NEON2SSE_INLINE int16x4_t vaddhn_s32(int32x4_t a, int32x4_t b) // VADDHN.I32 d0,q0,q0 -{ - int16x4_t res64; - __m128i sum; - sum = _mm_add_epi32 (a, b); - sum = _mm_srai_epi32(sum, 16); - sum = _mm_packs_epi32 (sum, sum); //use 64 low bits only - return64(sum); -} - -int32x2_t vaddhn_s64(int64x2_t a, int64x2_t b); // VADDHN.I64 d0,q0,q0 
-_NEON2SSE_INLINE int32x2_t vaddhn_s64(int64x2_t a, int64x2_t b) -{ - int32x2_t res64; - __m128i sum; - sum = _mm_add_epi64 (a, b); - sum = _mm_shuffle_epi32(sum, 1 | (3 << 2) | (0 << 4) | (2 << 6)); - return64(sum); -} - -uint8x8_t vaddhn_u16(uint16x8_t a, uint16x8_t b); // VADDHN.I16 d0,q0,q0 -_NEON2SSE_INLINE uint8x8_t vaddhn_u16(uint16x8_t a, uint16x8_t b) // VADDHN.I16 d0,q0,q0 -{ - uint8x8_t res64; - __m128i sum; - sum = _mm_add_epi16 (a, b); - sum = _mm_srli_epi16 (sum, 8); - sum = _mm_packus_epi16 (sum,sum); //use 64 low bits only - return64(sum); -} - -uint16x4_t vaddhn_u32(uint32x4_t a, uint32x4_t b); // VADDHN.I32 d0,q0,q0 -_NEON2SSE_INLINE uint16x4_t vaddhn_u32(uint32x4_t a, uint32x4_t b) // VADDHN.I32 d0,q0,q0 -{ - uint16x4_t res64; - __m128i sum; - sum = _mm_add_epi32 (a, b); - sum = _mm_srli_epi32 (sum, 16); - sum = _MM_PACKUS1_EPI32 (sum); //use 64 low bits only - return64(sum); -} - -uint32x2_t vaddhn_u64(uint64x2_t a, uint64x2_t b); // VADDHN.I64 d0,q0,q0 -#define vaddhn_u64 vaddhn_s64 - -//*********** Vector rounding add high half: vraddhn_<type> ******************. 
-//*************************************************************************** -int8x8_t vraddhn_s16(int16x8_t a, int16x8_t b); // VRADDHN.I16 d0,q0,q0 -_NEON2SSE_INLINE int8x8_t vraddhn_s16(int16x8_t a, int16x8_t b) // VRADDHN.I16 d0,q0,q0 -{ - int8x8_t res64; - __m128i sum, mask1; - sum = _mm_add_epi16 (a, b); - mask1 = _mm_slli_epi16(sum, 9); //shift left then back right to - mask1 = _mm_srli_epi16(mask1, 15); //get 7-th bit 1 or zero - sum = _mm_srai_epi16 (sum, 8); //get high half - sum = _mm_add_epi16 (sum, mask1); //actual rounding - sum = _mm_packs_epi16 (sum, sum); - return64(sum); -} - -int16x4_t vraddhn_s32(int32x4_t a, int32x4_t b); // VRADDHN.I32 d0,q0,q0 -_NEON2SSE_INLINE int16x4_t vraddhn_s32(int32x4_t a, int32x4_t b) // VRADDHN.I32 d0,q0,q0 -{ - //SIMD may be not optimal, serial may be faster - int16x4_t res64; - __m128i sum, mask1; - sum = _mm_add_epi32 (a, b); - mask1 = _mm_slli_epi32(sum, 17); //shift left then back right to - mask1 = _mm_srli_epi32(mask1,31); //get 15-th bit 1 or zero - sum = _mm_srai_epi32 (sum, 16); //get high half - sum = _mm_add_epi32 (sum, mask1); //actual rounding - sum = _mm_packs_epi32 (sum, sum); - return64(sum); -} - -int32x2_t vraddhn_s64(int64x2_t a, int64x2_t b); // VRADDHN.I64 d0,q0,q0 -_NEON2SSE_INLINE int32x2_t vraddhn_s64(int64x2_t a, int64x2_t b) -{ - //SIMD may be not optimal, serial may be faster - int32x2_t res64; - __m128i sum, mask1; - sum = _mm_add_epi64 (a, b); - mask1 = _mm_slli_epi64(sum, 33); //shift left then back right to - mask1 = _mm_srli_epi64(mask1,32); //get 31-th bit 1 or zero - sum = _mm_add_epi64 (sum, mask1); //actual high half rounding - sum = _mm_shuffle_epi32(sum, 1 | (3 << 2) | (1 << 4) | (3 << 6)); - return64(sum); -} - -uint8x8_t vraddhn_u16(uint16x8_t a, uint16x8_t b); // VRADDHN.I16 d0,q0,q0 -_NEON2SSE_INLINE uint8x8_t vraddhn_u16(uint16x8_t a, uint16x8_t b) // VRADDHN.I16 d0,q0,q0 -{ - uint8x8_t res64; - __m128i sum, mask1; - sum = _mm_add_epi16 (a, b); - mask1 = 
_mm_slli_epi16(sum, 9); //shift left then back right to - mask1 = _mm_srli_epi16(mask1, 15); //get 7-th bit 1 or zero - sum = _mm_srai_epi16 (sum, 8); //get high half - sum = _mm_add_epi16 (sum, mask1); //actual rounding - sum = _mm_packus_epi16 (sum, sum); - return64(sum); -} - -uint16x4_t vraddhn_u32(uint32x4_t a, uint32x4_t b); // VRADDHN.I32 d0,q0,q0 -_NEON2SSE_INLINE uint16x4_t vraddhn_u32(uint32x4_t a, uint32x4_t b) -{ - //SIMD may be not optimal, serial may be faster - uint16x4_t res64; - __m128i sum, mask1; - sum = _mm_add_epi32 (a, b); - mask1 = _mm_slli_epi32(sum, 17); //shift left then back right to - mask1 = _mm_srli_epi32(mask1,31); //get 15-th bit 1 or zero - sum = _mm_srai_epi32 (sum, 16); //get high half - sum = _mm_add_epi32 (sum, mask1); //actual rounding - sum = _MM_PACKUS1_EPI32 (sum); - return64(sum); -} - -uint32x2_t vraddhn_u64(uint64x2_t a, uint64x2_t b); // VRADDHN.I64 d0,q0,q0 -#define vraddhn_u64 vraddhn_s64 - -//********************************************************************************** -//********* Multiplication ************************************* -//************************************************************************************** - -//Vector multiply: vmul -> Vr[i] := Va[i] * Vb[i] -//As we don't go to wider result functions are equal to "multiply low" in x86 -int8x8_t vmul_s8(int8x8_t a, int8x8_t b); // VMUL.I8 d0,d0,d0 -_NEON2SSE_INLINE int8x8_t vmul_s8(int8x8_t a, int8x8_t b) // VMUL.I8 d0,d0,d0 -{ - // no 8 bit simd multiply, need to go to 16 bits in SSE - int8x8_t res64; - __m128i a128, b128, res; - _NEON2SSE_ALIGN_16 int8_t mask8_16_even_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15 }; - a128 = _MM_CVTEPI8_EPI16 (_pM128i(a)); // SSE 4.1 use low 64 bits - b128 = _MM_CVTEPI8_EPI16 (_pM128i(b)); // SSE 4.1 use low 64 bits - res = _mm_mullo_epi16 (a128, b128); - res = _mm_shuffle_epi8 (res, *(__m128i*) mask8_16_even_odd); //return to 8 bit from 16, use 64 low bits only - return64(res); -} - 
-int16x4_t vmul_s16(int16x4_t a, int16x4_t b); // VMUL.I16 d0,d0,d0 -#define vmul_s16 vmul_u16 - -int32x2_t vmul_s32(int32x2_t a, int32x2_t b); // VMUL.I32 d0,d0,d0 -#define vmul_s32 vmul_u32 - -float32x2_t vmul_f32(float32x2_t a, float32x2_t b); // VMUL.F32 d0,d0,d0 -_NEON2SSE_INLINE float32x2_t vmul_f32(float32x2_t a, float32x2_t b) -{ - float32x4_t tmp; - __m64_128 res64; - tmp = _mm_mul_ps(_pM128(a),_pM128(b)); - _M64f(res64, tmp); //use low 64 bits - return res64; -} - -uint8x8_t vmul_u8(uint8x8_t a, uint8x8_t b); // VMUL.I8 d0,d0,d0 -_NEON2SSE_INLINE uint8x8_t vmul_u8(uint8x8_t a, uint8x8_t b) // VMUL.I8 d0,d0,d0 -{ - // no 8 bit simd multiply, need to go to 16 bits in SSE - uint8x8_t res64; - __m128i mask, a128, b128, res; - mask = _mm_set1_epi16(0xff); - a128 = _MM_CVTEPU8_EPI16 (_pM128i(a)); - b128 = _MM_CVTEPU8_EPI16 (_pM128i(b)); - res = _mm_mullo_epi16 (a128, b128); - res = _mm_and_si128(res, mask); //to avoid saturation - res = _mm_packus_epi16 (res,res); //use only low 64 bits - return64(res); -} - -uint16x4_t vmul_u16(uint16x4_t a, uint16x4_t b); // VMUL.I16 d0,d0,d0 -_NEON2SSE_INLINE uint16x4_t vmul_u16(uint16x4_t a, uint16x4_t b) -{ - uint16x4_t res64; - return64(_mm_mullo_epi16(_pM128i(a),_pM128i(b))); -} - - -uint32x2_t vmul_u32(uint32x2_t a, uint32x2_t b); // VMUL.I32 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING( uint32x2_t vmul_u32(uint32x2_t a, uint32x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - uint32x2_t res; - res.m64_u32[0] = a.m64_u32[0] * b.m64_u32[0]; - res.m64_u32[1] = a.m64_u32[1] * b.m64_u32[1]; - return res; -} - -poly8x8_t vmul_p8(poly8x8_t a, poly8x8_t b); // VMUL.P8 d0,d0,d0 -_NEON2SSE_INLINE poly8x8_t vmul_p8(poly8x8_t a, poly8x8_t b) -{ - //may be optimized - poly8x8_t res64; - __m128i a64, b64, c1, res, tmp, bmasked; - int i; - a64 = _pM128i(a); - b64 = _pM128i(b); - c1 = _mm_cmpeq_epi8 (a64,a64); //all ones 0xff.... 
- c1 = vshrq_n_u8(c1,7); //0x1 - bmasked = _mm_and_si128(b64, c1); //0x1 - res = vmulq_u8(a64, bmasked); - for(i = 1; i<8; i++) { - c1 = _mm_slli_epi16(c1,1); //shift mask left by 1, 16 bit shift is OK here - bmasked = _mm_and_si128(b64, c1); //0x1 - tmp = vmulq_u8(a64, bmasked); - res = _mm_xor_si128(res, tmp); - } - return64 (res); -} - -int8x16_t vmulq_s8(int8x16_t a, int8x16_t b); // VMUL.I8 q0,q0,q0 -_NEON2SSE_INLINE int8x16_t vmulq_s8(int8x16_t a, int8x16_t b) // VMUL.I8 q0,q0,q0 -{ - // no 8 bit simd multiply, need to go to 16 bits - //solution may be not optimal - __m128i a16, b16, r16_1, r16_2; - _NEON2SSE_ALIGN_16 int8_t mask8_16_even_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15 }; - a16 = _MM_CVTEPI8_EPI16 (a); // SSE 4.1 - b16 = _MM_CVTEPI8_EPI16 (b); // SSE 4.1 - r16_1 = _mm_mullo_epi16 (a16, b16); - //swap hi and low part of a and b to process the remaining data - a16 = _mm_shuffle_epi32 (a, _SWAP_HI_LOW32); - b16 = _mm_shuffle_epi32 (b, _SWAP_HI_LOW32); - a16 = _MM_CVTEPI8_EPI16 (a16); // SSE 4.1 - b16 = _MM_CVTEPI8_EPI16 (b16); // SSE 4.1 - - r16_2 = _mm_mullo_epi16 (a16, b16); - r16_1 = _mm_shuffle_epi8 (r16_1, *(__m128i*)mask8_16_even_odd); //return to 8 bit - r16_2 = _mm_shuffle_epi8 (r16_2, *(__m128i*)mask8_16_even_odd); //return to 8 bit - - return _mm_unpacklo_epi64(r16_1, r16_2); -} - -int16x8_t vmulq_s16(int16x8_t a, int16x8_t b); // VMUL.I16 q0,q0,q0 -#define vmulq_s16 _mm_mullo_epi16 - -int32x4_t vmulq_s32(int32x4_t a, int32x4_t b); // VMUL.I32 q0,q0,q0 -#define vmulq_s32 _MM_MULLO_EPI32 //SSE4.1 - -float32x4_t vmulq_f32(float32x4_t a, float32x4_t b); // VMUL.F32 q0,q0,q0 -#define vmulq_f32 _mm_mul_ps - -uint8x16_t vmulq_u8(uint8x16_t a, uint8x16_t b); // VMUL.I8 q0,q0,q0 -_NEON2SSE_INLINE uint8x16_t vmulq_u8(uint8x16_t a, uint8x16_t b) // VMUL.I8 q0,q0,q0 -{ - // no 8 bit simd multiply, need to go to 16 bits - //solution may be not optimal - __m128i maskff, a16, b16, r16_1, r16_2; - maskff = 
_mm_set1_epi16(0xff); - a16 = _MM_CVTEPU8_EPI16 (a); // SSE 4.1 - b16 = _MM_CVTEPU8_EPI16 (b); // SSE 4.1 - r16_1 = _mm_mullo_epi16 (a16, b16); - r16_1 = _mm_and_si128(r16_1, maskff); //to avoid saturation - //swap hi and low part of a and b to process the remaining data - a16 = _mm_shuffle_epi32 (a, _SWAP_HI_LOW32); - b16 = _mm_shuffle_epi32 (b, _SWAP_HI_LOW32); - a16 = _MM_CVTEPU8_EPI16 (a16); // SSE 4.1 - b16 = _MM_CVTEPU8_EPI16 (b16); // SSE 4.1 - - r16_2 = _mm_mullo_epi16 (a16, b16); - r16_2 = _mm_and_si128(r16_2, maskff); //to avoid saturation - return _mm_packus_epi16 (r16_1, r16_2); -} - -uint16x8_t vmulq_u16(uint16x8_t a, uint16x8_t b); // VMUL.I16 q0,q0,q0 -#define vmulq_u16 _mm_mullo_epi16 - -uint32x4_t vmulq_u32(uint32x4_t a, uint32x4_t b); // VMUL.I32 q0,q0,q0 -#define vmulq_u32 _MM_MULLO_EPI32 //SSE4.1 - -poly8x16_t vmulq_p8(poly8x16_t a, poly8x16_t b); // VMUL.P8 q0,q0,q0 -_NEON2SSE_INLINE poly8x16_t vmulq_p8(poly8x16_t a, poly8x16_t b) -{ - //may be optimized - __m128i c1, res, tmp, bmasked; - int i; - c1 = _mm_cmpeq_epi8 (a,a); //all ones 0xff.... 
- c1 = vshrq_n_u8(c1,7); //0x1 - bmasked = _mm_and_si128(b, c1); //0x1 - res = vmulq_u8(a, bmasked); - for(i = 1; i<8; i++) { - c1 = _mm_slli_epi16(c1,1); //shift mask left by 1, 16 bit shift is OK here - bmasked = _mm_and_si128(b, c1); //0x1 - tmp = vmulq_u8(a, bmasked); - res = _mm_xor_si128(res, tmp); - } - return res; -} - -//************************* Vector long multiply *********************************** -//**************************************************************************** -int16x8_t vmull_s8(int8x8_t a, int8x8_t b); // VMULL.S8 q0,d0,d0 -_NEON2SSE_INLINE int16x8_t vmull_s8(int8x8_t a, int8x8_t b) // VMULL.S8 q0,d0,d0 -{ - //no 8 bit simd multiply, need to go to 16 bits - __m128i a16, b16; - a16 = _MM_CVTEPI8_EPI16 (_pM128i(a)); // SSE 4.1 - b16 = _MM_CVTEPI8_EPI16 (_pM128i(b)); // SSE 4.1 - return _mm_mullo_epi16 (a16, b16); //should fit into 16 bit -} - -int32x4_t vmull_s16(int16x4_t a, int16x4_t b); // VMULL.S16 q0,d0,d0 -_NEON2SSE_INLINE int32x4_t vmull_s16(int16x4_t a, int16x4_t b) // VMULL.S16 q0,d0,d0 -{ - #ifdef USE_SSE4 - __m128i a16, b16; - a16 = _MM_CVTEPI16_EPI32 (_pM128i(a)); // SSE 4.1 - b16 = _MM_CVTEPI16_EPI32 (_pM128i(b)); // SSE 4.1 - return _MM_MULLO_EPI32 (a16, b16); // SSE 4.1 - #else - __m128i low, hi, a128,b128; - a128 = _pM128i(a); - b128 = _pM128i(b); - low = _mm_mullo_epi16(a128,b128); - hi = _mm_mulhi_epi16(a128,b128); - return _mm_unpacklo_epi16(low,hi); - #endif -} - -int64x2_t vmull_s32(int32x2_t a, int32x2_t b); // VMULL.S32 q0,d0,d0 -_NEON2SSE_INLINE int64x2_t vmull_s32(int32x2_t a, int32x2_t b) // VMULL.S32 q0,d0,d0 -{ - __m128i ab, ba, a128, b128; - a128 = _pM128i(a); - b128 = _pM128i(b); - ab = _mm_unpacklo_epi32 (a128, b128); //a0, b0, a1,b1 - ba = _mm_unpacklo_epi32 (b128, a128); //b0, a0, b1,a1 - return _MM_MUL_EPI32(ab, ba); //uses the 1st and 3rd data lanes, the multiplication gives a 64-bit result -} - -uint16x8_t vmull_u8(uint8x8_t a, uint8x8_t b); // VMULL.U8 q0,d0,d0 -_NEON2SSE_INLINE uint16x8_t 
vmull_u8(uint8x8_t a, uint8x8_t b) // VMULL.U8 q0,d0,d0 -{ - //no 8 bit simd multiply, need to go to 16 bits - __m128i a16, b16; - a16 = _MM_CVTEPU8_EPI16 (_pM128i(a)); // SSE 4.1 - b16 = _MM_CVTEPU8_EPI16 (_pM128i(b)); // SSE 4.1 - return _mm_mullo_epi16 (a16, b16); //should fit into 16 bit -} - -uint32x4_t vmull_u16(uint16x4_t a, uint16x4_t b); // VMULL.s16 q0,d0,d0 -_NEON2SSE_INLINE uint32x4_t vmull_u16(uint16x4_t a, uint16x4_t b) // VMULL.s16 q0,d0,d0 -{ - #ifdef USE_SSE4 - __m128i a16, b16; - a16 = _MM_CVTEPU16_EPI32 (_pM128i(a)); // SSE 4.1 - b16 = _MM_CVTEPU16_EPI32 (_pM128i(b)); // SSE 4.1 - return _MM_MULLO_EPI32 (a16, b16); // SSE 4.1 - #else - __m128i a128,b128,low, hi; - a128 = _pM128i(a); - b128 = _pM128i(b); - low = _mm_mullo_epi16(a128,b128); - hi = _mm_mulhi_epu16(a128,b128); - return _mm_unpacklo_epi16(low,hi); - #endif -} - -uint64x2_t vmull_u32(uint32x2_t a, uint32x2_t b); // VMULL.U32 q0,d0,d0 -_NEON2SSE_INLINE uint64x2_t vmull_u32(uint32x2_t a, uint32x2_t b) // VMULL.U32 q0,d0,d0 -{ - //may be not optimal compared with serial implementation - __m128i ab, ba, a128, b128; - a128 = _pM128i(a); - b128 = _pM128i(b); - ab = _mm_unpacklo_epi32 (a128, b128); //a0, b0, a1,b1 - ba = _mm_unpacklo_epi32 (b128, a128); //b0, a0, b1,a1 - return _mm_mul_epu32 (ab, ba); //uses the 1st and 3rd data lanes, the multiplication gives a 64-bit result -} - -poly16x8_t vmull_p8(poly8x8_t a, poly8x8_t b); // VMULL.P8 q0,d0,d0 -_NEON2SSE_INLINE poly16x8_t vmull_p8(poly8x8_t a, poly8x8_t b) -{ - //may be optimized - __m128i a128,b128, c1, a128_16, bmasked_16, res, tmp, bmasked; - int i; - a128 = _pM128i(a); - b128 = _pM128i(b); - c1 = _mm_cmpeq_epi8 (a128,a128); //all ones 0xff.... 
- c1 = vshrq_n_u8(c1,7); //0x1 - bmasked = _mm_and_si128(b128, c1); //0x1 - - a128_16 = _MM_CVTEPU8_EPI16 (a128); // SSE 4.1 - bmasked_16 = _MM_CVTEPU8_EPI16 (bmasked); // SSE 4.1 - res = _mm_mullo_epi16 (a128_16, bmasked_16); //should fit into 16 bit - for(i = 1; i<8; i++) { - c1 = _mm_slli_epi16(c1,1); //shift mask left by 1, 16 bit shift is OK here - bmasked = _mm_and_si128(b128, c1); //0x1 - bmasked_16 = _MM_CVTEPU8_EPI16 (bmasked); // SSE 4.1 - tmp = _mm_mullo_epi16 (a128_16, bmasked_16); //should fit into 16 bit, vmull_u8(a, bmasked); - res = _mm_xor_si128(res, tmp); - } - return res; -} - -//****************Vector saturating doubling long multiply ************************** -//***************************************************************** -int32x4_t vqdmull_s16(int16x4_t a, int16x4_t b); // VQDMULL.S16 q0,d0,d0 -_NEON2SSE_INLINE int32x4_t vqdmull_s16(int16x4_t a, int16x4_t b) -{ - //the serial solution may be faster due to saturation - __m128i res; - res = vmull_s16(a, b); - return vqd_s32(res); -} - -int64x2_t vqdmull_s32(int32x2_t a, int32x2_t b); // VQDMULL.S32 q0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x2_t vqdmull_s32(int32x2_t a, int32x2_t b),_NEON2SSE_REASON_SLOW_SERIAL) -{ - //the serial solution may be faster due to saturation - __m128i res; - res = vmull_s32(a,b); - return vqaddq_s64(res,res); //slow serial function!!!! 
-} - -//********************* Vector multiply accumulate: vmla -> Vr[i] := Va[i] + Vb[i] * Vc[i] ************************ -//****************************************************************************************** -int8x8_t vmla_s8(int8x8_t a, int8x8_t b, int8x8_t c); // VMLA.I8 d0,d0,d0 -_NEON2SSE_INLINE int8x8_t vmla_s8(int8x8_t a, int8x8_t b, int8x8_t c) // VMLA.I8 d0,d0,d0 -{ - // no 8 bit x86 simd multiply, need to go to 16 bits, and use the low 64 bits - int8x8_t res64; - __m128i b128, c128, res; - _NEON2SSE_ALIGN_16 int8_t mask8_16_even_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15 }; - b128 = _MM_CVTEPI8_EPI16 (_pM128i(b)); // SSE 4.1 use low 64 bits - c128 = _MM_CVTEPI8_EPI16 (_pM128i(c)); // SSE 4.1 use low 64 bits - res = _mm_mullo_epi16 (c128, b128); - res = _mm_shuffle_epi8 (res, *(__m128i*) mask8_16_even_odd); - res = _mm_add_epi8 (res, _pM128i(a)); //use the low 64 bits - return64(res); -} - -int16x4_t vmla_s16(int16x4_t a, int16x4_t b, int16x4_t c); // VMLA.I16 d0,d0,d0 -_NEON2SSE_INLINE int16x4_t vmla_s16(int16x4_t a, int16x4_t b, int16x4_t c) -{ - int16x4_t res64; - return64(vmlaq_s16(_pM128i(a),_pM128i(b), _pM128i(c))); -} - - -int32x2_t vmla_s32(int32x2_t a, int32x2_t b, int32x2_t c); // VMLA.I32 d0,d0,d0 -_NEON2SSE_INLINE int32x2_t vmla_s32(int32x2_t a, int32x2_t b, int32x2_t c) // VMLA.I32 d0,d0,d0 -{ - int32x2_t res64; - __m128i res; - res = _MM_MULLO_EPI32 (_pM128i(b), _pM128i(c)); //SSE4.1 - res = _mm_add_epi32 (res, _pM128i(a)); //use the low 64 bits - return64(res); -} - -float32x2_t vmla_f32(float32x2_t a, float32x2_t b, float32x2_t c); // VMLA.F32 d0,d0,d0 -_NEON2SSE_INLINE float32x2_t vmla_f32(float32x2_t a, float32x2_t b, float32x2_t c) -{ - //fma is coming soon, but right now: - __m128 res; - __m64_128 res64; - res = _mm_mul_ps (_pM128(c), _pM128(b)); - res = _mm_add_ps (_pM128(a), res); - _M64f(res64, res); - return res64; -} - -uint8x8_t vmla_u8(uint8x8_t a, uint8x8_t b, uint8x8_t c); // VMLA.I8 d0,d0,d0 
-_NEON2SSE_INLINE uint8x8_t vmla_u8(uint8x8_t a, uint8x8_t b, uint8x8_t c) // VMLA.I8 d0,d0,d0 -{ - // no 8 bit x86 simd multiply, need to go to 16 bits, and use the low 64 bits - uint8x8_t res64; - __m128i mask, b128, c128, res; - mask = _mm_set1_epi16(0xff); - b128 = _MM_CVTEPU8_EPI16 (_pM128i(b)); // SSE 4.1 use low 64 bits - c128 = _MM_CVTEPU8_EPI16 (_pM128i(c)); // SSE 4.1 use low 64 bits - res = _mm_mullo_epi16 (c128, b128); - res = _mm_and_si128(res, mask); //to avoid saturation - res = _mm_packus_epi16 (res, res); - res = _mm_add_epi8 (res, _pM128i(a)); //use the low 64 bits - return64(res); -} - -uint16x4_t vmla_u16(uint16x4_t a, uint16x4_t b, uint16x4_t c); // VMLA.I16 d0,d0,d0 -#define vmla_u16 vmla_s16 - -uint32x2_t vmla_u32(uint32x2_t a, uint32x2_t b, uint32x2_t c); // VMLA.I32 d0,d0,d0 -#define vmla_u32 vmla_s32 - -int8x16_t vmlaq_s8(int8x16_t a, int8x16_t b, int8x16_t c); // VMLA.I8 q0,q0,q0 -_NEON2SSE_INLINE int8x16_t vmlaq_s8(int8x16_t a, int8x16_t b, int8x16_t c) // VMLA.I8 q0,q0,q0 -{ - //solution may be not optimal - // no 8 bit simd multiply, need to go to 16 bits - __m128i b16, c16, r16_1, a_2,r16_2; - _NEON2SSE_ALIGN_16 int8_t mask8_16_even_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15 }; - b16 = _MM_CVTEPI8_EPI16 (b); // SSE 4.1 - c16 = _MM_CVTEPI8_EPI16 (c); // SSE 4.1 - r16_1 = _mm_mullo_epi16 (b16, c16); - r16_1 = _mm_shuffle_epi8 (r16_1, *(__m128i*) mask8_16_even_odd); //return to 8 bits - r16_1 = _mm_add_epi8 (r16_1, a); - //swap hi and low part of a, b and c to process the remaining data - a_2 = _mm_shuffle_epi32 (a, _SWAP_HI_LOW32); - b16 = _mm_shuffle_epi32 (b, _SWAP_HI_LOW32); - c16 = _mm_shuffle_epi32 (c, _SWAP_HI_LOW32); - b16 = _MM_CVTEPI8_EPI16 (b16); // SSE 4.1 - c16 = _MM_CVTEPI8_EPI16 (c16); // SSE 4.1 - - r16_2 = _mm_mullo_epi16 (b16, c16); - r16_2 = _mm_shuffle_epi8 (r16_2, *(__m128i*) mask8_16_even_odd); - r16_2 = _mm_add_epi8(r16_2, a_2); - return _mm_unpacklo_epi64(r16_1,r16_2); -} - -int16x8_t 
vmlaq_s16(int16x8_t a, int16x8_t b, int16x8_t c); // VMLA.I16 q0,q0,q0 -_NEON2SSE_INLINE int16x8_t vmlaq_s16(int16x8_t a, int16x8_t b, int16x8_t c) // VMLA.I16 q0,q0,q0 -{ - __m128i res; - res = _mm_mullo_epi16 (c, b); - return _mm_add_epi16 (res, a); -} - -int32x4_t vmlaq_s32(int32x4_t a, int32x4_t b, int32x4_t c); // VMLA.I32 q0,q0,q0 -_NEON2SSE_INLINE int32x4_t vmlaq_s32(int32x4_t a, int32x4_t b, int32x4_t c) // VMLA.I32 q0,q0,q0 -{ - __m128i res; - res = _MM_MULLO_EPI32 (c, b); //SSE4.1 - return _mm_add_epi32 (res, a); -} - -float32x4_t vmlaq_f32(float32x4_t a, float32x4_t b, float32x4_t c); // VMLA.F32 q0,q0,q0 -_NEON2SSE_INLINE float32x4_t vmlaq_f32(float32x4_t a, float32x4_t b, float32x4_t c) // VMLA.F32 q0,q0,q0 -{ - //fma is coming soon, but right now: - __m128 res; - res = _mm_mul_ps (c, b); - return _mm_add_ps (a, res); -} - -uint8x16_t vmlaq_u8(uint8x16_t a, uint8x16_t b, uint8x16_t c); // VMLA.I8 q0,q0,q0 -_NEON2SSE_INLINE uint8x16_t vmlaq_u8(uint8x16_t a, uint8x16_t b, uint8x16_t c) // VMLA.I8 q0,q0,q0 -{ - //solution may be not optimal - // no 8 bit simd multiply, need to go to 16 bits - __m128i b16, c16, r16_1, a_2, r16_2; - _NEON2SSE_ALIGN_16 int8_t mask8_16_even_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15 }; - b16 = _MM_CVTEPU8_EPI16 (b); // SSE 4.1 - c16 = _MM_CVTEPU8_EPI16 (c); // SSE 4.1 - r16_1 = _mm_mullo_epi16 (b16, c16); - r16_1 = _mm_shuffle_epi8 (r16_1, *(__m128i*) mask8_16_even_odd); //return to 8 bits - r16_1 = _mm_add_epi8 (r16_1, a); - //swap hi and low part of a, b and c to process the remaining data - a_2 = _mm_shuffle_epi32 (a, _SWAP_HI_LOW32); - b16 = _mm_shuffle_epi32 (b, _SWAP_HI_LOW32); - c16 = _mm_shuffle_epi32 (c, _SWAP_HI_LOW32); - b16 = _MM_CVTEPU8_EPI16 (b16); // SSE 4.1 - c16 = _MM_CVTEPU8_EPI16 (c16); // SSE 4.1 - - r16_2 = _mm_mullo_epi16 (b16, c16); - r16_2 = _mm_shuffle_epi8 (r16_2, *(__m128i*) mask8_16_even_odd); - r16_2 = _mm_add_epi8(r16_2, a_2); - return _mm_unpacklo_epi64(r16_1,r16_2); -} - 
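The 8-bit MLA paths above (vmla_s8, vmla_u8, vmlaq_s8, vmlaq_u8) all follow the same recipe: widen the operands to 16 bits, multiply, keep only the low byte of each product, then do the 8-bit add. A scalar sketch of that lane-wise semantics, using a hypothetical helper name that is not part of the header, is:

```c
#include <assert.h>
#include <stdint.h>

/* Scalar reference for the 8-bit multiply-accumulate emulated above:
   r[i] = a[i] + b[i]*c[i], where the product is computed in 16 bits
   and only its low byte is kept before the final wrap-around 8-bit add. */
static void vmla_u8_ref(uint8_t r[8], const uint8_t a[8],
                        const uint8_t b[8], const uint8_t c[8])
{
    for (int i = 0; i < 8; i++) {
        uint16_t prod16 = (uint16_t)b[i] * (uint16_t)c[i]; /* widened multiply */
        r[i] = (uint8_t)(a[i] + (uint8_t)prod16);          /* low byte only, then add */
    }
}
```

The shuffle with `mask8_16_even_odd` in the SSE versions plays the role of the `(uint8_t)prod16` truncation here: it gathers the low byte of each 16-bit product back into packed 8-bit lanes.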
-uint16x8_t vmlaq_u16(uint16x8_t a, uint16x8_t b, uint16x8_t c); // VMLA.I16 q0,q0,q0 -#define vmlaq_u16 vmlaq_s16 - -uint32x4_t vmlaq_u32(uint32x4_t a, uint32x4_t b, uint32x4_t c); // VMLA.I32 q0,q0,q0 -#define vmlaq_u32 vmlaq_s32 - -//********************** Vector widening multiply accumulate (long multiply accumulate): -// vmla -> Vr[i] := Va[i] + Vb[i] * Vc[i] ************** -//******************************************************************************************** -int16x8_t vmlal_s8(int16x8_t a, int8x8_t b, int8x8_t c); // VMLAL.S8 q0,d0,d0 -_NEON2SSE_INLINE int16x8_t vmlal_s8(int16x8_t a, int8x8_t b, int8x8_t c) // VMLAL.S8 q0,d0,d0 -{ - int16x8_t res; - res = vmull_s8(b, c); - return _mm_add_epi16 (res, a); -} - -int32x4_t vmlal_s16(int32x4_t a, int16x4_t b, int16x4_t c); // VMLAL.S16 q0,d0,d0 -_NEON2SSE_INLINE int32x4_t vmlal_s16(int32x4_t a, int16x4_t b, int16x4_t c) // VMLAL.S16 q0,d0,d0 -{ - //may be not optimal compared with serial implementation - int32x4_t res; - res = vmull_s16(b, c); - return _mm_add_epi32 (res, a); -} - -int64x2_t vmlal_s32(int64x2_t a, int32x2_t b, int32x2_t c); // VMLAL.S32 q0,d0,d0 -_NEON2SSE_INLINE int64x2_t vmlal_s32(int64x2_t a, int32x2_t b, int32x2_t c) // VMLAL.S32 q0,d0,d0 -{ - //may be not optimal compared with serial implementation - int64x2_t res; - res = vmull_s32( b, c); - return _mm_add_epi64 (res, a); -} - -uint16x8_t vmlal_u8(uint16x8_t a, uint8x8_t b, uint8x8_t c); // VMLAL.U8 q0,d0,d0 -_NEON2SSE_INLINE uint16x8_t vmlal_u8(uint16x8_t a, uint8x8_t b, uint8x8_t c) // VMLAL.U8 q0,d0,d0 -{ - uint16x8_t res; - res = vmull_u8(b, c); - return _mm_add_epi16 (res, a); -} - -uint32x4_t vmlal_u16(uint32x4_t a, uint16x4_t b, uint16x4_t c); // VMLAL.s16 q0,d0,d0 -_NEON2SSE_INLINE uint32x4_t vmlal_u16(uint32x4_t a, uint16x4_t b, uint16x4_t c) // VMLAL.s16 q0,d0,d0 -{ - //may be not optimal compared with serial implementation - uint32x4_t res; - res = vmull_u16(b, c); - return _mm_add_epi32 (res, a); -} - -uint64x2_t 
vmlal_u32(uint64x2_t a, uint32x2_t b, uint32x2_t c); // VMLAL.U32 q0,d0,d0 -_NEON2SSE_INLINE uint64x2_t vmlal_u32(uint64x2_t a, uint32x2_t b, uint32x2_t c) // VMLAL.U32 q0,d0,d0 -{ - //may be not optimal compared with serial implementation - int64x2_t res; - res = vmull_u32( b,c); - return _mm_add_epi64 (res, a); -} - -//******************** Vector multiply subtract: vmls -> Vr[i] := Va[i] - Vb[i] * Vc[i] *************************************** -//******************************************************************************************** -int8x8_t vmls_s8(int8x8_t a, int8x8_t b, int8x8_t c); // VMLS.I8 d0,d0,d0 -_NEON2SSE_INLINE int8x8_t vmls_s8(int8x8_t a, int8x8_t b, int8x8_t c) // VMLS.I8 d0,d0,d0 -{ - // no 8 bit simd multiply, need to go to 16 bits - and use the low 64 bits - int8x8_t res64; - __m128i res; - res64 = vmul_s8(b,c); - res = _mm_sub_epi8 (_pM128i(a), _pM128i(res64)); - return64(res); -} - -int16x4_t vmls_s16(int16x4_t a, int16x4_t b, int16x4_t c); // VMLS.I16 d0,d0,d0 -_NEON2SSE_INLINE int16x4_t vmls_s16(int16x4_t a, int16x4_t b, int16x4_t c) -{ - int16x4_t res64; - return64(vmlsq_s16(_pM128i(a),_pM128i(b), _pM128i(c))); -} - - -int32x2_t vmls_s32(int32x2_t a, int32x2_t b, int32x2_t c); // VMLS.I32 d0,d0,d0 -_NEON2SSE_INLINE int32x2_t vmls_s32(int32x2_t a, int32x2_t b, int32x2_t c) // VMLS.I32 d0,d0,d0 -{ - int32x2_t res64; - __m128i res; - res = _MM_MULLO_EPI32 (_pM128i(c),_pM128i( b)); //SSE4.1 - res = _mm_sub_epi32 (_pM128i(a),res); //use low 64 bits only - return64(res); -} - -float32x2_t vmls_f32(float32x2_t a, float32x2_t b, float32x2_t c); // VMLS.F32 d0,d0,d0 -_NEON2SSE_INLINE float32x2_t vmls_f32(float32x2_t a, float32x2_t b, float32x2_t c) -{ - __m128 res; - __m64_128 res64; - res = _mm_mul_ps (_pM128(c), _pM128(b)); - res = _mm_sub_ps (_pM128(a), res); - _M64f(res64, res); - return res64; -} - -uint8x8_t vmls_u8(uint8x8_t a, uint8x8_t b, uint8x8_t c); // VMLS.I8 d0,d0,d0 -_NEON2SSE_INLINE uint8x8_t vmls_u8(uint8x8_t a, uint8x8_t b, 
uint8x8_t c) -{ - // no 8 bit simd multiply, need to go to 16 bits - and use the low 64 bits - uint8x8_t res64; - __m128i res; - res64 = vmul_u8(b,c); - res = _mm_sub_epi8 (_pM128i(a), _pM128i(res64)); - return64(res); -} - -uint16x4_t vmls_u16(uint16x4_t a, uint16x4_t b, uint16x4_t c); // VMLS.I16 d0,d0,d0 -#define vmls_u16 vmls_s16 - -uint32x2_t vmls_u32(uint32x2_t a, uint32x2_t b, uint32x2_t c); // VMLS.I32 d0,d0,d0 -#define vmls_u32 vmls_s32 - - -int8x16_t vmlsq_s8(int8x16_t a, int8x16_t b, int8x16_t c); // VMLS.I8 q0,q0,q0 -_NEON2SSE_INLINE int8x16_t vmlsq_s8(int8x16_t a, int8x16_t b, int8x16_t c) // VMLS.I8 q0,q0,q0 -{ - //solution may be not optimal - // no 8 bit simd multiply, need to go to 16 bits - __m128i b16, c16, r16_1, a_2, r16_2; - _NEON2SSE_ALIGN_16 int8_t mask8_16_even_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15 }; - b16 = _MM_CVTEPI8_EPI16 (b); // SSE 4.1 - c16 = _MM_CVTEPI8_EPI16 (c); // SSE 4.1 - r16_1 = _mm_mullo_epi16 (b16, c16); - r16_1 = _mm_shuffle_epi8 (r16_1, *(__m128i*) mask8_16_even_odd); - r16_1 = _mm_sub_epi8 (a, r16_1); - //swap hi and low part of a, b, c to process the remaining data - a_2 = _mm_shuffle_epi32 (a, _SWAP_HI_LOW32); - b16 = _mm_shuffle_epi32 (b, _SWAP_HI_LOW32); - c16 = _mm_shuffle_epi32 (c, _SWAP_HI_LOW32); - b16 = _MM_CVTEPI8_EPI16 (b16); // SSE 4.1 - c16 = _MM_CVTEPI8_EPI16 (c16); // SSE 4.1 - - r16_2 = _mm_mullo_epi16 (b16, c16); - r16_2 = _mm_shuffle_epi8 (r16_2, *(__m128i*) mask8_16_even_odd); - r16_2 = _mm_sub_epi8 (a_2, r16_2); - return _mm_unpacklo_epi64(r16_1,r16_2); -} - -int16x8_t vmlsq_s16(int16x8_t a, int16x8_t b, int16x8_t c); // VMLS.I16 q0,q0,q0 -_NEON2SSE_INLINE int16x8_t vmlsq_s16(int16x8_t a, int16x8_t b, int16x8_t c) // VMLS.I16 q0,q0,q0 -{ - __m128i res; - res = _mm_mullo_epi16 (c, b); - return _mm_sub_epi16 (a, res); -} - -int32x4_t vmlsq_s32(int32x4_t a, int32x4_t b, int32x4_t c); // VMLS.I32 q0,q0,q0 -_NEON2SSE_INLINE int32x4_t vmlsq_s32(int32x4_t a, int32x4_t b, int32x4_t 
c) // VMLS.I32 q0,q0,q0 -{ - __m128i res; - res = _MM_MULLO_EPI32 (c, b); //SSE4.1 - return _mm_sub_epi32 (a, res); -} - -float32x4_t vmlsq_f32(float32x4_t a, float32x4_t b, float32x4_t c); // VMLS.F32 q0,q0,q0 -_NEON2SSE_INLINE float32x4_t vmlsq_f32(float32x4_t a, float32x4_t b, float32x4_t c) // VMLS.F32 q0,q0,q0 -{ - __m128 res; - res = _mm_mul_ps (c, b); - return _mm_sub_ps (a, res); -} - -uint8x16_t vmlsq_u8(uint8x16_t a, uint8x16_t b, uint8x16_t c); // VMLS.I8 q0,q0,q0 -_NEON2SSE_INLINE uint8x16_t vmlsq_u8(uint8x16_t a, uint8x16_t b, uint8x16_t c) // VMLS.I8 q0,q0,q0 -{ - //solution may be not optimal - // no 8 bit simd multiply, need to go to 16 bits - __m128i b16, c16, r16_1, a_2, r16_2; - _NEON2SSE_ALIGN_16 int8_t mask8_16_even_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15 }; - b16 = _MM_CVTEPU8_EPI16 (b); // SSE 4.1 - c16 = _MM_CVTEPU8_EPI16 (c); // SSE 4.1 - r16_1 = _mm_mullo_epi16 (b16, c16); - r16_1 = _mm_shuffle_epi8 (r16_1, *(__m128i*) mask8_16_even_odd); //return to 8 bits - r16_1 = _mm_sub_epi8 (a, r16_1); - //swap hi and low part of a, b and c to process the remaining data - a_2 = _mm_shuffle_epi32 (a, _SWAP_HI_LOW32); - b16 = _mm_shuffle_epi32 (b, _SWAP_HI_LOW32); - c16 = _mm_shuffle_epi32 (c, _SWAP_HI_LOW32); - b16 = _MM_CVTEPU8_EPI16 (b16); // SSE 4.1 - c16 = _MM_CVTEPU8_EPI16 (c16); // SSE 4.1 - - r16_2 = _mm_mullo_epi16 (b16, c16); - r16_2 = _mm_shuffle_epi8 (r16_2, *(__m128i*) mask8_16_even_odd); - r16_2 = _mm_sub_epi8(a_2, r16_2); - return _mm_unpacklo_epi64(r16_1,r16_2); -} - -uint16x8_t vmlsq_u16(uint16x8_t a, uint16x8_t b, uint16x8_t c); // VMLS.I16 q0,q0,q0 -#define vmlsq_u16 vmlsq_s16 - -uint32x4_t vmlsq_u32(uint32x4_t a, uint32x4_t b, uint32x4_t c); // VMLS.I32 q0,q0,q0 -#define vmlsq_u32 vmlsq_s32 - -//******************** Vector multiply subtract long (widening multiply subtract) ************************************ 
-//************************************************************************************************************* -int16x8_t vmlsl_s8(int16x8_t a, int8x8_t b, int8x8_t c); // VMLSL.S8 q0,d0,d0 -_NEON2SSE_INLINE int16x8_t vmlsl_s8(int16x8_t a, int8x8_t b, int8x8_t c) // VMLSL.S8 q0,d0,d0 -{ - int16x8_t res; - res = vmull_s8(b, c); - return _mm_sub_epi16 (a, res); -} - -int32x4_t vmlsl_s16(int32x4_t a, int16x4_t b, int16x4_t c); // VMLSL.S16 q0,d0,d0 -_NEON2SSE_INLINE int32x4_t vmlsl_s16(int32x4_t a, int16x4_t b, int16x4_t c) // VMLSL.S16 q0,d0,d0 -{ - //may be not optimal compared with serial implementation - int32x4_t res; - res = vmull_s16(b, c); - return _mm_sub_epi32 (a, res); -} - -int64x2_t vmlsl_s32(int64x2_t a, int32x2_t b, int32x2_t c); // VMLSL.S32 q0,d0,d0 -_NEON2SSE_INLINE int64x2_t vmlsl_s32(int64x2_t a, int32x2_t b, int32x2_t c) // VMLSL.S32 q0,d0,d0 -{ - //may be not optimal compared with serial implementation - int64x2_t res; - res = vmull_s32( b,c); - return _mm_sub_epi64 (a, res); -} - -uint16x8_t vmlsl_u8(uint16x8_t a, uint8x8_t b, uint8x8_t c); // VMLSL.U8 q0,d0,d0 -_NEON2SSE_INLINE uint16x8_t vmlsl_u8(uint16x8_t a, uint8x8_t b, uint8x8_t c) // VMLSL.U8 q0,d0,d0 -{ - uint16x8_t res; - res = vmull_u8(b, c); - return _mm_sub_epi16 (a, res); -} - -uint32x4_t vmlsl_u16(uint32x4_t a, uint16x4_t b, uint16x4_t c); // VMLSL.s16 q0,d0,d0 -_NEON2SSE_INLINE uint32x4_t vmlsl_u16(uint32x4_t a, uint16x4_t b, uint16x4_t c) // VMLSL.s16 q0,d0,d0 -{ - //may be not optimal compared with serial implementation - uint32x4_t res; - res = vmull_u16(b, c); - return _mm_sub_epi32 (a, res); -} - -uint64x2_t vmlsl_u32(uint64x2_t a, uint32x2_t b, uint32x2_t c); // VMLSL.U32 q0,d0,d0 -_NEON2SSE_INLINE uint64x2_t vmlsl_u32(uint64x2_t a, uint32x2_t b, uint32x2_t c) // VMLSL.U32 q0,d0,d0 -{ - //may be not optimal compared with serial implementation - int64x2_t res; - res = vmull_u32( b,c); - return _mm_sub_epi64 (a, res); -} - -//****** Vector saturating doubling multiply high 
********************** -//************************************************************************* -int16x4_t vqdmulh_s16(int16x4_t a, int16x4_t b); // VQDMULH.S16 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int16x4_t vqdmulh_s16(int16x4_t a, int16x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - int16x4_t res; - int32_t a32, b32, i; - for (i = 0; i<4; i++) { - a32 = (int32_t) a.m64_i16[i]; - b32 = (int32_t) b.m64_i16[i]; - a32 = (a32 * b32) >> 15; - res.m64_i16[i] = (a32 == 0x8000) ? 0x7fff : (int16_t) a32; - } - return res; -} - -int32x2_t vqdmulh_s32(int32x2_t a, int32x2_t b); // VQDMULH.S32 d0,d0,d0 -_NEON2SSE_INLINE int32x2_t vqdmulh_s32(int32x2_t a, int32x2_t b) // no multiply high 32 bit SIMD in IA32, so need to do some tricks, serial solution may be faster -{ - //may be not optimal compared with a serial solution - int32x2_t res64; - __m128i mask; - _NEON2SSE_ALIGN_16 uint32_t cmask32[] = {0x80000000, 0x80000000, 0x80000000, 0x80000000}; - int64x2_t mul; - mul = vmull_s32(a,b); - mul = _mm_slli_epi64(mul,1); //double the result - //at this point start treating 2 64-bit numbers as 4 32-bit - mul = _mm_shuffle_epi32 (mul, 1 | (3 << 2) | (0 << 4) | (2 << 6)); //shuffle the data to get 2 32-bits - mask = _mm_cmpeq_epi32 (mul, *(__m128i*)cmask32); - mul = _mm_xor_si128 (mul, mask); //res saturated for 0x80000000 - return64(mul); -} - -int16x8_t vqdmulhq_s16(int16x8_t a, int16x8_t b); // VQDMULH.S16 q0,q0,q0 -_NEON2SSE_INLINE int16x8_t vqdmulhq_s16(int16x8_t a, int16x8_t b) // VQDMULH.S16 q0,q0,q0 -{ - __m128i res, res_lo, mask; - _NEON2SSE_ALIGN_16 uint16_t cmask[] = {0x8000, 0x8000, 0x8000, 0x8000, 0x8000, 0x8000, 0x8000, 0x8000}; - res = _mm_mulhi_epi16 (a, b); - res = _mm_slli_epi16 (res, 1); //double the result, don't care about saturation - res_lo = _mm_mullo_epi16 (a, b); - res_lo = _mm_srli_epi16(res_lo,15); //take the highest bit - res = _mm_add_epi16(res, res_lo); //combine results - mask = _mm_cmpeq_epi16 (res, *(__m128i*)cmask); - return 
_mm_xor_si128 (res, mask); //res saturated for 0x8000 -} - -int32x4_t vqdmulhq_s32(int32x4_t a, int32x4_t b); // VQDMULH.S32 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x4_t vqdmulhq_s32(int32x4_t a, int32x4_t b), _NEON2SSE_REASON_SLOW_UNEFFECTIVE) -{ - // no multiply high 32 bit SIMD in IA32, may not be optimal compared with a serial solution for the SSSE3 target - __m128i ab, ba, mask, mul, mul1; - _NEON2SSE_ALIGN_16 uint32_t cmask32[] = {0x80000000, 0x80000000, 0x80000000, 0x80000000}; - ab = _mm_unpacklo_epi32 (a, b); //a0, b0, a1,b1 - ba = _mm_unpacklo_epi32 (b, a); //b0, a0, b1,a1 - mul = _MM_MUL_EPI32(ab, ba); //uses the first and third data lanes; the multiplication gives a 64-bit result - mul = _mm_slli_epi64(mul,1); //double the result - ab = _mm_unpackhi_epi32 (a, b); //a2, b2, a3,b3 - ba = _mm_unpackhi_epi32 (b, a); //b2, a2, b3,a3 - mul1 = _MM_MUL_EPI32(ab, ba); //uses the first and third data lanes; the multiplication gives a 64-bit result - mul1 = _mm_slli_epi64(mul1,1); //double the result - mul = _mm_shuffle_epi32 (mul, 1 | (3 << 2) | (0 << 4) | (2 << 6)); //shuffle the data to get 2 32-bits - mul1 = _mm_shuffle_epi32 (mul1, 1 | (3 << 2) | (0 << 4) | (2 << 6)); //shuffle the data to get 2 32-bits - mul = _mm_unpacklo_epi64(mul, mul1); - mask = _mm_cmpeq_epi32 (mul, *(__m128i*)cmask32); - return _mm_xor_si128 (mul, mask); //res saturated for 0x80000000 -} - -//********* Vector saturating rounding doubling multiply high **************** -//**************************************************************************** -//If the _mm_mulhrs_xx functions are used, the result may differ slightly from the NEON one due to different rounding rules and order -int16x4_t vqrdmulh_s16(int16x4_t a, int16x4_t b); // VQRDMULH.S16 d0,d0,d0 -_NEON2SSE_INLINE int16x4_t vqrdmulh_s16(int16x4_t a, int16x4_t b) -{ - int16x4_t res64; - return64(vqrdmulhq_s16(_pM128i(a), _pM128i(b))); -} - -int32x2_t vqrdmulh_s32(int32x2_t a, int32x2_t b); // VQRDMULH.S32 d0,d0,d0 -_NEON2SSE_INLINE 
_NEON2SSE_PERFORMANCE_WARNING(int32x2_t vqrdmulh_s32(int32x2_t a, int32x2_t b), _NEON2SSE_REASON_SLOW_UNEFFECTIVE) -{ - //may be not optimal compared with a serial solution - int32x2_t res64; - _NEON2SSE_ALIGN_16 uint32_t cmask32[] = {0x80000000, 0x80000000, 0x80000000, 0x80000000}; - __m128i res_sat, mask, mask1; - int64x2_t mul; - mul = vmull_s32(a,b); - res_sat = _mm_slli_epi64 (mul, 1); //double the result, saturation not considered - mask1 = _mm_slli_epi64(res_sat, 32); //shift left then back right to - mask1 = _mm_srli_epi64(mask1,31); //get 31-th bit 1 or zero - mul = _mm_add_epi32 (res_sat, mask1); //actual rounding - //at this point start treating 2 64-bit numbers as 4 32-bit - mul = _mm_shuffle_epi32 (mul, 1 | (3 << 2) | (0 << 4) | (2 << 6)); //shuffle the data to get 2 32-bits from each 64-bit - mask = _mm_cmpeq_epi32 (mul, *(__m128i*)cmask32); - mul = _mm_xor_si128 (mul, mask); //res saturated for 0x80000000 - return64(mul); -} - -int16x8_t vqrdmulhq_s16(int16x8_t a, int16x8_t b); // VQRDMULH.S16 q0,q0,q0 -_NEON2SSE_INLINE int16x8_t vqrdmulhq_s16(int16x8_t a, int16x8_t b) // VQRDMULH.S16 q0,q0,q0 -{ - __m128i mask, res; - _NEON2SSE_ALIGN_16 uint16_t cmask[] = {0x8000, 0x8000, 0x8000, 0x8000, 0x8000, 0x8000, 0x8000, 0x8000}; - res = _mm_mulhrs_epi16 (a, b); - mask = _mm_cmpeq_epi16 (res, *(__m128i*)cmask); - return _mm_xor_si128 (res, mask); //res saturated for 0x8000 -} - -int32x4_t vqrdmulhq_s32(int32x4_t a, int32x4_t b); // VQRDMULH.S32 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x4_t vqrdmulhq_s32(int32x4_t a, int32x4_t b), _NEON2SSE_REASON_SLOW_UNEFFECTIVE) -{ - // no multiply high 32 bit SIMD in IA32, may be not optimal compared with a serial solution for the SSSE3 target - __m128i ab, ba, mask, mul, mul1, mask1; - _NEON2SSE_ALIGN_16 uint32_t cmask32[] = {0x80000000, 0x80000000, 0x80000000, 0x80000000}; - ab = _mm_unpacklo_epi32 (a, b); //a0, b0, a1,b1 - ba = _mm_unpacklo_epi32 (b, a); //b0, a0, b1,a1 - mul = _MM_MUL_EPI32(ab, 
ba); //uses the first and third data lanes; the multiplication gives a 64-bit result - mul = _mm_slli_epi64 (mul, 1); //double the result, saturation not considered - mask1 = _mm_slli_epi64(mul, 32); //shift left then back right to - mask1 = _mm_srli_epi64(mask1,31); //get the 31st bit, 1 or zero - mul = _mm_add_epi32 (mul, mask1); //actual rounding - - ab = _mm_unpackhi_epi32 (a, b); //a2, b2, a3,b3 - ba = _mm_unpackhi_epi32 (b, a); //b2, a2, b3,a3 - mul1 = _MM_MUL_EPI32(ab, ba); //uses the first and third data lanes; the multiplication gives a 64-bit result - mul1 = _mm_slli_epi64 (mul1, 1); //double the result, saturation not considered - mask1 = _mm_slli_epi64(mul1, 32); //shift left then back right to - mask1 = _mm_srli_epi64(mask1,31); //get the 31st bit, 1 or zero - mul1 = _mm_add_epi32 (mul1, mask1); //actual rounding - //at this point start treating 2 64-bit numbers as 4 32-bit - mul = _mm_shuffle_epi32 (mul, 1 | (3 << 2) | (0 << 4) | (2 << 6)); //shuffle the data to get 2 32-bits from each 64-bit - mul1 = _mm_shuffle_epi32 (mul1, 1 | (3 << 2) | (0 << 4) | (2 << 6)); //shuffle the data to get 2 32-bits from each 64-bit - mul = _mm_unpacklo_epi64(mul, mul1); - mask = _mm_cmpeq_epi32 (mul, *(__m128i*)cmask32); - return _mm_xor_si128 (mul, mask); //res saturated for 0x80000000 -} - -//*************Vector widening saturating doubling multiply accumulate (long saturating doubling multiply accumulate) ***** -//************************************************************************************************************************* -int32x4_t vqdmlal_s16(int32x4_t a, int16x4_t b, int16x4_t c); // VQDMLAL.S16 q0,d0,d0 -_NEON2SSE_INLINE int32x4_t vqdmlal_s16(int32x4_t a, int16x4_t b, int16x4_t c) // VQDMLAL.S16 q0,d0,d0 -{ - //not an optimal SIMD solution; serial may be faster - __m128i res32; - res32 = vmull_s16(b, c); - res32 = vqd_s32(res32); //doubling & saturation; if no saturation we could use _mm_slli_epi32 (res, 1); - return vqaddq_s32(res32, a); //saturation -} - -int64x2_t 
vqdmlal_s32(int64x2_t a, int32x2_t b, int32x2_t c); // VQDMLAL.S32 q0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x2_t vqdmlal_s32(int64x2_t a, int32x2_t b, int32x2_t c),_NEON2SSE_REASON_SLOW_SERIAL) -{ - __m128i res64; - res64 = vmull_s32(b,c); - res64 = vqaddq_s64(res64, res64); //doubling & saturation ,if no saturation we could use _mm_slli_epi64 (res, 1); - return vqaddq_s64(res64, a); //saturation -} - -//************************************************************************************ -//****************** Vector subtract *********************************************** -//************************************************************************************ -int8x8_t vsub_s8(int8x8_t a, int8x8_t b); // VSUB.I8 d0,d0,d0 -_NEON2SSE_INLINE int8x8_t vsub_s8(int8x8_t a, int8x8_t b) -{ - int8x8_t res64; - return64(_mm_sub_epi8(_pM128i(a),_pM128i(b))); -} - - -int16x4_t vsub_s16(int16x4_t a, int16x4_t b); // VSUB.I16 d0,d0,d0 -_NEON2SSE_INLINE int16x4_t vsub_s16(int16x4_t a, int16x4_t b) -{ - int16x4_t res64; - return64(_mm_sub_epi16(_pM128i(a),_pM128i(b))); -} - - -int32x2_t vsub_s32(int32x2_t a, int32x2_t b); // VSUB.I32 d0,d0,d0 -_NEON2SSE_INLINE int32x2_t vsub_s32(int32x2_t a, int32x2_t b) -{ - int32x2_t res64; - return64(_mm_sub_epi32(_pM128i(a),_pM128i(b))); -} - - -int64x1_t vsub_s64(int64x1_t a, int64x1_t b); // VSUB.I64 d0,d0,d0 -_NEON2SSE_INLINE int64x1_t vsub_s64(int64x1_t a, int64x1_t b) -{ - int64x1_t res64; - res64.m64_i64[0] = a.m64_i64[0] - b.m64_i64[0]; - return res64; -} - - -float32x2_t vsub_f32(float32x2_t a, float32x2_t b); // VSUB.F32 d0,d0,d0 -_NEON2SSE_INLINE float32x2_t vsub_f32(float32x2_t a, float32x2_t b) -{ - float32x2_t res; - res.m64_f32[0] = a.m64_f32[0] - b.m64_f32[0]; - res.m64_f32[1] = a.m64_f32[1] - b.m64_f32[1]; - return res; -} - -uint8x8_t vsub_u8(uint8x8_t a, uint8x8_t b); // VSUB.I8 d0,d0,d0 -#define vsub_u8 vsub_s8 - -uint16x4_t vsub_u16(uint16x4_t a, uint16x4_t b); // VSUB.I16 d0,d0,d0 -#define vsub_u16 
vsub_s16 - -uint32x2_t vsub_u32(uint32x2_t a, uint32x2_t b); // VSUB.I32 d0,d0,d0 -#define vsub_u32 vsub_s32 - - -uint64x1_t vsub_u64(uint64x1_t a, uint64x1_t b); // VSUB.I64 d0,d0,d0 -_NEON2SSE_INLINE uint64x1_t vsub_u64(uint64x1_t a, uint64x1_t b) -{ - int64x1_t res64; - res64.m64_u64[0] = a.m64_u64[0] - b.m64_u64[0]; - return res64; -} - - -int8x16_t vsubq_s8(int8x16_t a, int8x16_t b); // VSUB.I8 q0,q0,q0 -#define vsubq_s8 _mm_sub_epi8 - -int16x8_t vsubq_s16(int16x8_t a, int16x8_t b); // VSUB.I16 q0,q0,q0 -#define vsubq_s16 _mm_sub_epi16 - -int32x4_t vsubq_s32(int32x4_t a, int32x4_t b); // VSUB.I32 q0,q0,q0 -#define vsubq_s32 _mm_sub_epi32 - -int64x2_t vsubq_s64(int64x2_t a, int64x2_t b); // VSUB.I64 q0,q0,q0 -#define vsubq_s64 _mm_sub_epi64 - -float32x4_t vsubq_f32(float32x4_t a, float32x4_t b); // VSUB.F32 q0,q0,q0 -#define vsubq_f32 _mm_sub_ps - -uint8x16_t vsubq_u8(uint8x16_t a, uint8x16_t b); // VSUB.I8 q0,q0,q0 -#define vsubq_u8 _mm_sub_epi8 - -uint16x8_t vsubq_u16(uint16x8_t a, uint16x8_t b); // VSUB.I16 q0,q0,q0 -#define vsubq_u16 _mm_sub_epi16 - -uint32x4_t vsubq_u32(uint32x4_t a, uint32x4_t b); // VSUB.I32 q0,q0,q0 -#define vsubq_u32 _mm_sub_epi32 - -uint64x2_t vsubq_u64(uint64x2_t a, uint64x2_t b); // VSUB.I64 q0,q0,q0 -#define vsubq_u64 _mm_sub_epi64 - -//***************Vector long subtract: vsub -> Vr[i]:=Va[i]-Vb[i] ****************** -//*********************************************************************************** -//Va, Vb have equal lane sizes, result is a 128 bit vector of lanes that are twice the width. 
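The comment above states the vsubl contract: equal source lane sizes, destination lanes twice as wide. Before the per-type SSE implementations, that widening-subtract semantics can be sketched in scalar form (hypothetical helper name, not part of the header):

```c
#include <assert.h>
#include <stdint.h>

/* Scalar reference for vsubl_s8-style widening subtract: each 8-bit
   lane is sign-extended to 16 bits before subtracting, so the
   difference fits in [-255, 255] and never wraps the way a plain
   8-bit subtract would. */
static void vsubl_s8_ref(int16_t r[8], const int8_t a[8], const int8_t b[8])
{
    for (int i = 0; i < 8; i++)
        r[i] = (int16_t)a[i] - (int16_t)b[i]; /* exact in the wider type */
}
```

The SSE versions below do exactly this with `_MM_CVTEPI8_EPI16`/`_MM_CVTEPU8_EPI16` (sign or zero extension) followed by a full-width subtract.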
-int16x8_t vsubl_s8(int8x8_t a, int8x8_t b); // VSUBL.S8 q0,d0,d0 -_NEON2SSE_INLINE int16x8_t vsubl_s8(int8x8_t a, int8x8_t b) // VSUBL.S8 q0,d0,d0 -{ - __m128i a16, b16; - a16 = _MM_CVTEPI8_EPI16 (_pM128i(a)); //SSE4.1, - b16 = _MM_CVTEPI8_EPI16 (_pM128i(b)); //SSE4.1, - return _mm_sub_epi16 (a16, b16); -} - -int32x4_t vsubl_s16(int16x4_t a, int16x4_t b); // VSUBL.S16 q0,d0,d0 -_NEON2SSE_INLINE int32x4_t vsubl_s16(int16x4_t a, int16x4_t b) // VSUBL.S16 q0,d0,d0 -{ - __m128i a32, b32; - a32 = _MM_CVTEPI16_EPI32 (_pM128i(a)); //SSE4.1 - b32 = _MM_CVTEPI16_EPI32 (_pM128i(b)); //SSE4.1, - return _mm_sub_epi32 (a32, b32); -} - -int64x2_t vsubl_s32(int32x2_t a, int32x2_t b); // VSUBL.S32 q0,d0,d0 -_NEON2SSE_INLINE int64x2_t vsubl_s32(int32x2_t a, int32x2_t b) // VSUBL.S32 q0,d0,d0 -{ - //may be not optimal - __m128i a64, b64; - a64 = _MM_CVTEPI32_EPI64 (_pM128i(a)); //SSE4.1 - b64 = _MM_CVTEPI32_EPI64 (_pM128i(b)); //SSE4.1, - return _mm_sub_epi64 (a64, b64); -} - -uint16x8_t vsubl_u8(uint8x8_t a, uint8x8_t b); // VSUBL.U8 q0,d0,d0 -_NEON2SSE_INLINE uint16x8_t vsubl_u8(uint8x8_t a, uint8x8_t b) // VSUBL.U8 q0,d0,d0 -{ - __m128i a16, b16; - a16 = _MM_CVTEPU8_EPI16 (_pM128i(a)); //SSE4.1, - b16 = _MM_CVTEPU8_EPI16 (_pM128i(b)); //SSE4.1, - return _mm_sub_epi16 (a16, b16); -} - -uint32x4_t vsubl_u16(uint16x4_t a, uint16x4_t b); // VSUBL.s16 q0,d0,d0 -_NEON2SSE_INLINE uint32x4_t vsubl_u16(uint16x4_t a, uint16x4_t b) // VSUBL.s16 q0,d0,d0 -{ - __m128i a32, b32; - a32 = _MM_CVTEPU16_EPI32 (_pM128i(a)); //SSE4.1 - b32 = _MM_CVTEPU16_EPI32 (_pM128i(b)); //SSE4.1, - return _mm_sub_epi32 (a32, b32); -} - -uint64x2_t vsubl_u32(uint32x2_t a, uint32x2_t b); // VSUBL.U32 q0,d0,d0 -_NEON2SSE_INLINE uint64x2_t vsubl_u32(uint32x2_t a, uint32x2_t b) // VSUBL.U32 q0,d0,d0 -{ - //may be not optimal - __m128i a64, b64; - a64 = _MM_CVTEPU32_EPI64 (_pM128i(a)); //SSE4.1 - b64 = _MM_CVTEPU32_EPI64 (_pM128i(b)); //SSE4.1, - return _mm_sub_epi64 (a64, b64); -} - -//***************** Vector wide 
subtract: vsub -> Vr[i]:=Va[i]-Vb[i] ********************************** -//***************************************************************************************************** -int16x8_t vsubw_s8(int16x8_t a, int8x8_t b); // VSUBW.S8 q0,q0,d0 -_NEON2SSE_INLINE int16x8_t vsubw_s8(int16x8_t a, int8x8_t b) // VSUBW.S8 q0,q0,d0 -{ - __m128i b16; - b16 = _MM_CVTEPI8_EPI16 (_pM128i(b)); //SSE4.1, - return _mm_sub_epi16 (a, b16); -} - -int32x4_t vsubw_s16(int32x4_t a, int16x4_t b); // VSUBW.S16 q0,q0,d0 -_NEON2SSE_INLINE int32x4_t vsubw_s16(int32x4_t a, int16x4_t b) // VSUBW.S16 q0,q0,d0 -{ - __m128i b32; - b32 = _MM_CVTEPI16_EPI32 (_pM128i(b)); //SSE4.1, - return _mm_sub_epi32 (a, b32); -} - -int64x2_t vsubw_s32(int64x2_t a, int32x2_t b); // VSUBW.S32 q0,q0,d0 -_NEON2SSE_INLINE int64x2_t vsubw_s32(int64x2_t a, int32x2_t b) // VSUBW.S32 q0,q0,d0 -{ - __m128i b64; - b64 = _MM_CVTEPI32_EPI64 (_pM128i(b)); //SSE4.1 - return _mm_sub_epi64 (a, b64); -} - -uint16x8_t vsubw_u8(uint16x8_t a, uint8x8_t b); // VSUBW.U8 q0,q0,d0 -_NEON2SSE_INLINE uint16x8_t vsubw_u8(uint16x8_t a, uint8x8_t b) // VSUBW.U8 q0,q0,d0 -{ - __m128i b16; - b16 = _MM_CVTEPU8_EPI16 (_pM128i(b)); //SSE4.1, - return _mm_sub_epi16 (a, b16); -} - -uint32x4_t vsubw_u16(uint32x4_t a, uint16x4_t b); // VSUBW.s16 q0,q0,d0 -_NEON2SSE_INLINE uint32x4_t vsubw_u16(uint32x4_t a, uint16x4_t b) // VSUBW.s16 q0,q0,d0 -{ - __m128i b32; - b32 = _MM_CVTEPU16_EPI32 (_pM128i(b)); //SSE4.1, - return _mm_sub_epi32 (a, b32); -} - -uint64x2_t vsubw_u32(uint64x2_t a, uint32x2_t b); // VSUBW.U32 q0,q0,d0 -_NEON2SSE_INLINE uint64x2_t vsubw_u32(uint64x2_t a, uint32x2_t b) // VSUBW.U32 q0,q0,d0 -{ - __m128i b64; - b64 = _MM_CVTEPU32_EPI64 (_pM128i(b)); //SSE4.1 - return _mm_sub_epi64 (a, b64); -} - -//************************Vector saturating subtract ********************************* -//************************************************************************************* -int8x8_t vqsub_s8(int8x8_t a, int8x8_t b); // VQSUB.S8 d0,d0,d0 
-_NEON2SSE_INLINE int8x8_t vqsub_s8(int8x8_t a, int8x8_t b) -{ - int8x8_t res64; - return64(_mm_subs_epi8(_pM128i(a),_pM128i(b))); -} - - -int16x4_t vqsub_s16(int16x4_t a, int16x4_t b); // VQSUB.S16 d0,d0,d0 -_NEON2SSE_INLINE int16x4_t vqsub_s16(int16x4_t a, int16x4_t b) -{ - int16x4_t res64; - return64(_mm_subs_epi16(_pM128i(a),_pM128i(b))); -} - - -int32x2_t vqsub_s32(int32x2_t a, int32x2_t b); // VQSUB.S32 d0,d0,d0 -_NEON2SSE_INLINE int32x2_t vqsub_s32(int32x2_t a, int32x2_t b) -{ - int32x2_t res64; - return64(vqsubq_s32(_pM128i(a), _pM128i(b))); -} - - -int64x1_t vqsub_s64(int64x1_t a, int64x1_t b); // VQSUB.S64 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x1_t vqsub_s64(int64x1_t a, int64x1_t b), _NEON2SSE_REASON_SLOW_SERIAL) //no optimal SIMD solution -{ - uint64x1_t res; - uint64_t a64,b64; - a64 = a.m64_u64[0]; - b64 = b.m64_u64[0]; - res.m64_u64[0] = a64 - b64; - - a64 = (a64 >> 63) + (~_SIGNBIT64); - if ((int64_t)((a64 ^ b64) & (a64 ^ res.m64_u64[0])) < 0) { - res.m64_u64[0] = a64; - } - return res; -} - -uint8x8_t vqsub_u8(uint8x8_t a, uint8x8_t b); // VQSUB.U8 d0,d0,d0 -_NEON2SSE_INLINE uint8x8_t vqsub_u8(uint8x8_t a, uint8x8_t b) -{ - uint8x8_t res64; - return64(_mm_subs_epu8(_pM128i(a),_pM128i(b))); -} - - -uint16x4_t vqsub_u16(uint16x4_t a, uint16x4_t b); // VQSUB.s16 d0,d0,d0 -_NEON2SSE_INLINE uint16x4_t vqsub_u16(uint16x4_t a, uint16x4_t b) -{ - uint16x4_t res64; - return64(_mm_subs_epu16(_pM128i(a),_pM128i(b))); -} - - -uint32x2_t vqsub_u32(uint32x2_t a, uint32x2_t b); // VQSUB.U32 d0,d0,d0 -_NEON2SSE_INLINE uint32x2_t vqsub_u32(uint32x2_t a, uint32x2_t b) -{ - uint32x2_t res64; - return64(vqsubq_u32(_pM128i(a), _pM128i(b))); -} - - -uint64x1_t vqsub_u64(uint64x1_t a, uint64x1_t b); // VQSUB.U64 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x1_t vqsub_u64(uint64x1_t a, uint64x1_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - uint64x1_t res; - uint64_t a64, b64; - a64 = _Ui64(a); - b64 = _Ui64(b); - if (a64 > b64) { - 
res.m64_u64[0] = a64 - b64; - } else { - res.m64_u64[0] = 0; - } - return res; -} - -int8x16_t vqsubq_s8(int8x16_t a, int8x16_t b); // VQSUB.S8 q0,q0,q0 -#define vqsubq_s8 _mm_subs_epi8 - -int16x8_t vqsubq_s16(int16x8_t a, int16x8_t b); // VQSUB.S16 q0,q0,q0 -#define vqsubq_s16 _mm_subs_epi16 - -int32x4_t vqsubq_s32(int32x4_t a, int32x4_t b); // VQSUB.S32 q0,q0,q0 -_NEON2SSE_INLINE int32x4_t vqsubq_s32(int32x4_t a, int32x4_t b) -{ - //no corresponding x86 SIMD solution, special tricks are necessary. Overflow is possible only if a and b have opposite signs and the difference has the opposite sign to a - __m128i c7fffffff, res, res_sat, res_xor_a, b_xor_a; - c7fffffff = _mm_set1_epi32(0x7fffffff); - res = _mm_sub_epi32(a, b); - res_sat = _mm_srli_epi32(a, 31); - res_sat = _mm_add_epi32(res_sat, c7fffffff); - res_xor_a = _mm_xor_si128(res, a); - b_xor_a = _mm_xor_si128(b, a); - res_xor_a = _mm_and_si128(b_xor_a, res_xor_a); - res_xor_a = _mm_srai_epi32(res_xor_a,31); //propagate the sign bit: all ones if < 0, all zeros otherwise - res_sat = _mm_and_si128(res_xor_a, res_sat); - res = _mm_andnot_si128(res_xor_a, res); - return _mm_or_si128(res, res_sat); -} - -int64x2_t vqsubq_s64(int64x2_t a, int64x2_t b); // VQSUB.S64 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x2_t vqsubq_s64(int64x2_t a, int64x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) //no optimal SIMD solution -{ - _NEON2SSE_ALIGN_16 int64_t atmp[2], btmp[2]; - _NEON2SSE_ALIGN_16 uint64_t res[2]; - _mm_store_si128((__m128i*)atmp, a); - _mm_store_si128((__m128i*)btmp, b); - res[0] = atmp[0] - btmp[0]; - res[1] = atmp[1] - btmp[1]; - if (((res[0] ^ atmp[0]) & _SIGNBIT64) && ((atmp[0] ^ btmp[0]) & _SIGNBIT64)) { - res[0] = (atmp[0] >> 63) ^ ~_SIGNBIT64; - } - if (((res[1] ^ atmp[1]) & _SIGNBIT64) && ((atmp[1] ^ btmp[1]) & _SIGNBIT64)) { - res[1] = (atmp[1] >> 63) ^ ~_SIGNBIT64; - } - return _mm_load_si128((__m128i*)res); -} - -uint8x16_t vqsubq_u8(uint8x16_t a, uint8x16_t b); // VQSUB.U8 q0,q0,q0 -#define vqsubq_u8
_mm_subs_epu8 - -uint16x8_t vqsubq_u16(uint16x8_t a, uint16x8_t b); // VQSUB.s16 q0,q0,q0 -#define vqsubq_u16 _mm_subs_epu16 - -uint32x4_t vqsubq_u32(uint32x4_t a, uint32x4_t b); // VQSUB.U32 q0,q0,q0 -_NEON2SSE_INLINE uint32x4_t vqsubq_u32(uint32x4_t a, uint32x4_t b) // VQSUB.U32 q0,q0,q0 -{ - __m128i min, mask, sub; - min = _MM_MIN_EPU32(a, b); //SSE4.1 - mask = _mm_cmpeq_epi32 (min, b); - sub = _mm_sub_epi32 (a, b); - return _mm_and_si128 ( sub, mask); -} - -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x2_t vqsubq_u64(uint64x2_t a, uint64x2_t b), _NEON2SSE_REASON_SLOW_SERIAL); // VQSUB.U64 q0,q0,q0 -#ifdef USE_SSE4 - _NEON2SSE_INLINE uint64x2_t vqsubq_u64(uint64x2_t a, uint64x2_t b) - { - __m128i c80000000, subb, suba, cmp, sub; - c80000000 = _mm_set_epi32 (0x80000000, 0x0, 0x80000000, 0x0); - sub = _mm_sub_epi64 (a, b); - suba = _mm_sub_epi64 (a, c80000000); - subb = _mm_sub_epi64 (b, c80000000); - cmp = _mm_cmpgt_epi64 ( suba, subb); //no unsigned comparison, need to go to signed, SSE4.2!!! - return _mm_and_si128 (sub, cmp); //saturation - } -#else - _NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x2_t vqsubq_u64(uint64x2_t a, uint64x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) - { - _NEON2SSE_ALIGN_16 uint64_t atmp[2], btmp[2], res[2]; - _mm_store_si128((__m128i*)atmp, a); - _mm_store_si128((__m128i*)btmp, b); - res[0] = (atmp[0] > btmp[0]) ? atmp[0] - btmp[0] : 0; - res[1] = (atmp[1] > btmp[1]) ? 
atmp[1] - btmp[1] : 0; - return _mm_load_si128((__m128i*)(res)); - } -#endif - -//**********Vector halving subtract Vr[i]:=(Va[i]-Vb[i])>>1 ****************************************************** -//**************************************************************** -int8x8_t vhsub_s8(int8x8_t a, int8x8_t b); // VHSUB.S8 d0,d0,d0 -_NEON2SSE_INLINE int8x8_t vhsub_s8(int8x8_t a, int8x8_t b) // VHSUB.S8 d0,d0,d0 -{ - //no 8 bit shift available, internal overflow is possible, so let's go to 16 bit, - int8x8_t res64; - __m128i r16; - int8x8_t r; - r = vsub_s8 (a, b); - r16 = _MM_CVTEPI8_EPI16 (_pM128i(r)); //SSE 4.1 - r16 = _mm_srai_epi16 (r16, 1); //SSE2 - r16 = _mm_packs_epi16 (r16,r16); //use low 64 bits - return64(r16); -} - -int16x4_t vhsub_s16(int16x4_t a, int16x4_t b); // VHSUB.S16 d0,d0,d0 -_NEON2SSE_INLINE int16x4_t vhsub_s16(int16x4_t a, int16x4_t b) -{ - int16x4_t res64; - return64(vhsubq_s16(_pM128i(a), _pM128i(b))); -} - - - -int32x2_t vhsub_s32(int32x2_t a, int32x2_t b); // VHSUB.S32 d0,d0,d0 -_NEON2SSE_INLINE int32x2_t vhsub_s32(int32x2_t a, int32x2_t b) -{ - int32x2_t res64; - return64(vhsubq_s32(_pM128i(a), _pM128i(b))); -} - - -uint8x8_t vhsub_u8(uint8x8_t a, uint8x8_t b); // VHSUB.U8 d0,d0,d0 -_NEON2SSE_INLINE uint8x8_t vhsub_u8(uint8x8_t a, uint8x8_t b) -{ - uint8x8_t res64; - return64(vhsubq_u8(_pM128i(a), _pM128i(b))); -} - -uint16x4_t vhsub_u16(uint16x4_t a, uint16x4_t b); // VHSUB.s16 d0,d0,d0 -_NEON2SSE_INLINE uint16x4_t vhsub_u16(uint16x4_t a, uint16x4_t b) -{ - uint16x4_t res64; - return64(vhsubq_u16(_pM128i(a), _pM128i(b))); -} - -uint32x2_t vhsub_u32(uint32x2_t a, uint32x2_t b); // VHSUB.U32 d0,d0,d0 -_NEON2SSE_INLINE uint32x2_t vhsub_u32(uint32x2_t a, uint32x2_t b) -{ - uint32x2_t res64; - return64(vhsubq_u32(_pM128i(a), _pM128i(b))); -} - -int8x16_t vhsubq_s8(int8x16_t a, int8x16_t b); // VHSUB.S8 q0,q0,q0 -_NEON2SSE_INLINE int8x16_t vhsubq_s8(int8x16_t a, int8x16_t b) // VHSUB.S8 q0,q0,q0 -{ - // //need to deal with the possibility of 
internal overflow - __m128i c128, au,bu; - c128 = _mm_set1_epi8 (128); - au = _mm_add_epi8( a, c128); - bu = _mm_add_epi8( b, c128); - return vhsubq_u8(au,bu); -} - -int16x8_t vhsubq_s16(int16x8_t a, int16x8_t b); // VHSUB.S16 q0,q0,q0 -_NEON2SSE_INLINE int16x8_t vhsubq_s16(int16x8_t a, int16x8_t b) // VHSUB.S16 q0,q0,q0 -{ - //need to deal with the possibility of internal overflow - __m128i c8000, au,bu; - c8000 = _mm_set1_epi16(0x8000); - au = _mm_add_epi16( a, c8000); - bu = _mm_add_epi16( b, c8000); - return vhsubq_u16(au,bu); -} - -int32x4_t vhsubq_s32(int32x4_t a, int32x4_t b); // VHSUB.S32 q0,q0,q0 -_NEON2SSE_INLINE int32x4_t vhsubq_s32(int32x4_t a, int32x4_t b) // VHSUB.S32 q0,q0,q0 -{ - //need to deal with the possibility of internal overflow - __m128i a2, b2,r, b_1; - a2 = _mm_srai_epi32 (a,1); - b2 = _mm_srai_epi32 (b,1); - r = _mm_sub_epi32 (a2, b2); - b_1 = _mm_andnot_si128(a, b); //!a and b - b_1 = _mm_slli_epi32 (b_1,31); - b_1 = _mm_srli_epi32 (b_1,31); //0 or 1, last b bit - return _mm_sub_epi32(r,b_1); -} - -uint8x16_t vhsubq_u8(uint8x16_t a, uint8x16_t b); // VHSUB.U8 q0,q0,q0 -_NEON2SSE_INLINE uint8x16_t vhsubq_u8(uint8x16_t a, uint8x16_t b) // VHSUB.U8 q0,q0,q0 -{ - __m128i avg; - avg = _mm_avg_epu8 (a, b); - return _mm_sub_epi8(a, avg); -} - -uint16x8_t vhsubq_u16(uint16x8_t a, uint16x8_t b); // VHSUB.s16 q0,q0,q0 -_NEON2SSE_INLINE uint16x8_t vhsubq_u16(uint16x8_t a, uint16x8_t b) // VHSUB.s16 q0,q0,q0 -{ - __m128i avg; - avg = _mm_avg_epu16 (a, b); - return _mm_sub_epi16(a, avg); -} - -uint32x4_t vhsubq_u32(uint32x4_t a, uint32x4_t b); // VHSUB.U32 q0,q0,q0 -_NEON2SSE_INLINE uint32x4_t vhsubq_u32(uint32x4_t a, uint32x4_t b) // VHSUB.U32 q0,q0,q0 -{ - //need to deal with the possibility of internal overflow - __m128i a2, b2,r, b_1; - a2 = _mm_srli_epi32 (a,1); - b2 = _mm_srli_epi32 (b,1); - r = _mm_sub_epi32 (a2, b2); - b_1 = _mm_andnot_si128(a, b); //!a and b - b_1 = _mm_slli_epi32 (b_1,31); - b_1 = _mm_srli_epi32 (b_1,31); //0 or 1, last b 
bit - return _mm_sub_epi32(r,b_1); -} - -//******* Vector subtract high half (truncated) ** ************ -//************************************************************ -int8x8_t vsubhn_s16(int16x8_t a, int16x8_t b); // VSUBHN.I16 d0,q0,q0 -_NEON2SSE_INLINE int8x8_t vsubhn_s16(int16x8_t a, int16x8_t b) // VSUBHN.I16 d0,q0,q0 -{ - int8x8_t res64; - __m128i sum, sum8; - sum = _mm_sub_epi16 (a, b); - sum8 = _mm_srai_epi16 (sum, 8); - sum8 = _mm_packs_epi16(sum8,sum8); - return64(sum8); -} - -int16x4_t vsubhn_s32(int32x4_t a, int32x4_t b); // VSUBHN.I32 d0,q0,q0 -_NEON2SSE_INLINE int16x4_t vsubhn_s32(int32x4_t a, int32x4_t b) // VSUBHN.I32 d0,q0,q0 -{ - int16x4_t res64; - __m128i sum, sum16; - sum = _mm_sub_epi32 (a, b); - sum16 = _mm_srai_epi32 (sum, 16); - sum16 = _mm_packs_epi32(sum16,sum16); - return64(sum16); -} - -int32x2_t vsubhn_s64(int64x2_t a, int64x2_t b); // VSUBHN.I64 d0,q0,q0 -_NEON2SSE_INLINE int32x2_t vsubhn_s64(int64x2_t a, int64x2_t b) -{ - int32x2_t res64; - __m128i sub; - sub = _mm_sub_epi64 (a, b); - sub = _mm_shuffle_epi32(sub, 1 | (3 << 2) | (0 << 4) | (2 << 6)); - return64(sub); -} - -uint8x8_t vsubhn_u16(uint16x8_t a, uint16x8_t b); // VSUBHN.I16 d0,q0,q0 -_NEON2SSE_INLINE uint8x8_t vsubhn_u16(uint16x8_t a, uint16x8_t b) // VSUBHN.I16 d0,q0,q0 -{ - uint8x8_t res64; - __m128i sum, sum8; - sum = _mm_sub_epi16 (a, b); - sum8 = _mm_srli_epi16 (sum, 8); - sum8 = _mm_packus_epi16(sum8,sum8); - return64(sum8); -} - -uint16x4_t vsubhn_u32(uint32x4_t a, uint32x4_t b); // VSUBHN.I32 d0,q0,q0 -_NEON2SSE_INLINE uint16x4_t vsubhn_u32(uint32x4_t a, uint32x4_t b) // VSUBHN.I32 d0,q0,q0 -{ - uint16x4_t res64; - __m128i sum, sum16; - sum = _mm_sub_epi32 (a, b); - sum16 = _mm_srli_epi32 (sum, 16); - sum16 = _MM_PACKUS1_EPI32(sum16); - return64(sum16); -} - -uint32x2_t vsubhn_u64(uint64x2_t a, uint64x2_t b); // VSUBHN.I64 d0,q0,q0 -#define vsubhn_u64 vsubhn_s64 - -//************ Vector rounding subtract high half ********************* 
-//********************************************************************* -int8x8_t vrsubhn_s16(int16x8_t a, int16x8_t b); // VRSUBHN.I16 d0,q0,q0 -_NEON2SSE_INLINE int8x8_t vrsubhn_s16(int16x8_t a, int16x8_t b) // VRSUBHN.I16 d0,q0,q0 -{ - int8x8_t res64; - __m128i sub, mask1; - sub = _mm_sub_epi16 (a, b); - mask1 = _mm_slli_epi16(sub, 9); //shift left then back right to - mask1 = _mm_srli_epi16(mask1, 15); //get 7-th bit 1 or zero - sub = _mm_srai_epi16 (sub, 8); //get high half - sub = _mm_add_epi16 (sub, mask1); //actual rounding - sub = _mm_packs_epi16 (sub, sub); - return64(sub); -} - -int16x4_t vrsubhn_s32(int32x4_t a, int32x4_t b); // VRSUBHN.I32 d0,q0,q0 -_NEON2SSE_INLINE int16x4_t vrsubhn_s32(int32x4_t a, int32x4_t b) // VRSUBHN.I32 d0,q0,q0 -{ - //SIMD may be not optimal, serial may be faster - int16x4_t res64; - __m128i sub, mask1; - sub = _mm_sub_epi32 (a, b); - mask1 = _mm_slli_epi32(sub, 17); //shift left then back right to - mask1 = _mm_srli_epi32(mask1,31); //get 15-th bit 1 or zero - sub = _mm_srai_epi32 (sub, 16); //get high half - sub = _mm_add_epi32 (sub, mask1); //actual rounding - sub = _mm_packs_epi32 (sub, sub); - return64(sub); -} - -int32x2_t vrsubhn_s64(int64x2_t a, int64x2_t b); // VRSUBHN.I64 d0,q0,q0 -_NEON2SSE_INLINE int32x2_t vrsubhn_s64(int64x2_t a, int64x2_t b) -{ - //SIMD may be not optimal, serial may be faster - int32x2_t res64; - __m128i sub, mask1; - sub = _mm_sub_epi64 (a, b); - mask1 = _mm_slli_epi64(sub, 33); //shift left then back right to - mask1 = _mm_srli_epi64(mask1,32); //get 31-th bit 1 or zero - sub = _mm_add_epi64 (sub, mask1); //actual high half rounding - sub = _mm_shuffle_epi32(sub, 1 | (3 << 2) | (0 << 4) | (2 << 6)); - return64(sub); -} - -uint8x8_t vrsubhn_u16(uint16x8_t a, uint16x8_t b); // VRSUBHN.I16 d0,q0,q0 -_NEON2SSE_INLINE uint8x8_t vrsubhn_u16(uint16x8_t a, uint16x8_t b) // VRSUBHN.I16 d0,q0,q0 -{ - uint8x8_t res64; - __m128i sub, mask1; - sub = _mm_sub_epi16 (a, b); - mask1 = _mm_slli_epi16(sub, 9); 
//shift left then back right to - mask1 = _mm_srli_epi16(mask1, 15); //get 7-th bit 1 or zero - sub = _mm_srai_epi16 (sub, 8); //get high half - sub = _mm_add_epi16 (sub, mask1); //actual rounding - sub = _mm_packus_epi16 (sub, sub); - return64(sub); -} - -uint16x4_t vrsubhn_u32(uint32x4_t a, uint32x4_t b); // VRSUBHN.I32 d0,q0,q0 -_NEON2SSE_INLINE uint16x4_t vrsubhn_u32(uint32x4_t a, uint32x4_t b) // VRSUBHN.I32 d0,q0,q0 -{ - //SIMD may not be optimal, serial may be faster - uint16x4_t res64; - __m128i sub, mask1; - sub = _mm_sub_epi32 (a, b); - mask1 = _mm_slli_epi32(sub, 17); //shift left then back right to - mask1 = _mm_srli_epi32(mask1,31); //get 15-th bit 1 or zero - sub = _mm_srai_epi32 (sub, 16); //get high half - sub = _mm_add_epi32 (sub, mask1); //actual rounding - sub = _MM_PACKUS1_EPI32 (sub); - return64(sub); -} - -uint32x2_t vrsubhn_u64(uint64x2_t a, uint64x2_t b); // VRSUBHN.I64 d0,q0,q0 -#define vrsubhn_u64 vrsubhn_s64 - -//*********** Vector saturating doubling multiply subtract long ******************** -//************************************************************************************ -int32x4_t vqdmlsl_s16(int32x4_t a, int16x4_t b, int16x4_t c); // VQDMLSL.S16 q0,d0,d0 -_NEON2SSE_INLINE int32x4_t vqdmlsl_s16(int32x4_t a, int16x4_t b, int16x4_t c) -{ - //not an optimal SIMD solution, serial may be faster - __m128i res32, mask; - int32x4_t res; - _NEON2SSE_ALIGN_16 uint32_t cmask[] = {0x80000000, 0x80000000, 0x80000000, 0x80000000}; - res = vmull_s16(b, c); - res32 = _mm_slli_epi32 (res, 1); //double the result, saturation not considered - mask = _mm_cmpeq_epi32 (res32, *(__m128i*)cmask); - res32 = _mm_xor_si128 (res32, mask); //res32 saturated for 0x80000000 - return vqsubq_s32(a, res32); //saturation -} - -int64x2_t vqdmlsl_s32(int64x2_t a, int32x2_t b, int32x2_t c); // VQDMLSL.S32 q0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x2_t vqdmlsl_s32(int64x2_t a, int32x2_t b, int32x2_t c), _NEON2SSE_REASON_SLOW_SERIAL) -{ - __m128i
res64, mask; - int64x2_t res; - _NEON2SSE_ALIGN_16 uint64_t cmask[] = {0x8000000000000000, 0x8000000000000000}; - res = vmull_s32(b, c); - res64 = _mm_slli_epi64 (res, 1); //double the result, saturation not considered - mask = _MM_CMPEQ_EPI64 (res64, *(__m128i*)cmask); - res64 = _mm_xor_si128 (res64, mask); //res64 saturated for 0x8000000000000000 - return vqsubq_s64(a, res64); //saturation -} - -//****************** COMPARISON *************************************** -//******************* Vector compare equal ************************************* -//**************************************************************************** -uint8x8_t vceq_s8(int8x8_t a, int8x8_t b); // VCEQ.I8 d0, d0, d0 -_NEON2SSE_INLINE int8x8_t vceq_s8(int8x8_t a, int8x8_t b) -{ - int8x8_t res64; - return64(_mm_cmpeq_epi8(_pM128i(a),_pM128i(b))); -} - - -uint16x4_t vceq_s16(int16x4_t a, int16x4_t b); // VCEQ.I16 d0, d0, d0 -_NEON2SSE_INLINE int16x4_t vceq_s16(int16x4_t a, int16x4_t b) -{ - int16x4_t res64; - return64(_mm_cmpeq_epi16(_pM128i(a),_pM128i(b))); -} - - -uint32x2_t vceq_s32(int32x2_t a, int32x2_t b); // VCEQ.I32 d0, d0, d0 -_NEON2SSE_INLINE int32x2_t vceq_s32(int32x2_t a, int32x2_t b) -{ - int32x2_t res64; - return64(_mm_cmpeq_epi32(_pM128i(a),_pM128i(b))); -} - - -uint32x2_t vceq_f32(float32x2_t a, float32x2_t b); // VCEQ.F32 d0, d0, d0 -_NEON2SSE_INLINE uint32x2_t vceq_f32(float32x2_t a, float32x2_t b) -{ - uint32x2_t res64; - __m128 res; - res = _mm_cmpeq_ps(_pM128(a), _pM128(b) ); - return64f(res); -} - -uint8x8_t vceq_u8(uint8x8_t a, uint8x8_t b); // VCEQ.I8 d0, d0, d0 -_NEON2SSE_INLINE uint8x8_t vceq_u8(uint8x8_t a, uint8x8_t b) -{ - uint8x8_t res64; - return64(_mm_cmpeq_epi8(_pM128i(a),_pM128i(b))); -} - - -uint16x4_t vceq_u16(uint16x4_t a, uint16x4_t b); // VCEQ.I16 d0, d0, d0 -_NEON2SSE_INLINE uint16x4_t vceq_u16(uint16x4_t a, uint16x4_t b) -{ - uint16x4_t res64; - return64(_mm_cmpeq_epi16(_pM128i(a),_pM128i(b))); -} - - -uint32x2_t vceq_u32(uint32x2_t a, uint32x2_t b); //
VCEQ.I32 d0, d0, d0 -_NEON2SSE_INLINE uint32x2_t vceq_u32(uint32x2_t a, uint32x2_t b) -{ - uint32x2_t res64; - return64(_mm_cmpeq_epi32(_pM128i(a),_pM128i(b))); -} - - -uint8x8_t vceq_p8(poly8x8_t a, poly8x8_t b); // VCEQ.I8 d0, d0, d0 -#define vceq_p8 vceq_u8 - - -uint8x16_t vceqq_s8(int8x16_t a, int8x16_t b); // VCEQ.I8 q0, q0, q0 -#define vceqq_s8 _mm_cmpeq_epi8 - -uint16x8_t vceqq_s16(int16x8_t a, int16x8_t b); // VCEQ.I16 q0, q0, q0 -#define vceqq_s16 _mm_cmpeq_epi16 - -uint32x4_t vceqq_s32(int32x4_t a, int32x4_t b); // VCEQ.I32 q0, q0, q0 -#define vceqq_s32 _mm_cmpeq_epi32 - -uint32x4_t vceqq_f32(float32x4_t a, float32x4_t b); // VCEQ.F32 q0, q0, q0 -_NEON2SSE_INLINE uint32x4_t vceqq_f32(float32x4_t a, float32x4_t b) -{ - __m128 res; - res = _mm_cmpeq_ps(a,b); - return _M128i(res); -} - -uint8x16_t vceqq_u8(uint8x16_t a, uint8x16_t b); // VCEQ.I8 q0, q0, q0 -#define vceqq_u8 _mm_cmpeq_epi8 - -uint16x8_t vceqq_u16(uint16x8_t a, uint16x8_t b); // VCEQ.I16 q0, q0, q0 -#define vceqq_u16 _mm_cmpeq_epi16 - -uint32x4_t vceqq_u32(uint32x4_t a, uint32x4_t b); // VCEQ.I32 q0, q0, q0 -#define vceqq_u32 _mm_cmpeq_epi32 - -uint8x16_t vceqq_p8(poly8x16_t a, poly8x16_t b); // VCEQ.I8 q0, q0, q0 -#define vceqq_p8 _mm_cmpeq_epi8 - -//******************Vector compare greater-than or equal************************* -//******************************************************************************* -//in IA SIMD no greater-than-or-equal comparison for integers, -// there is greater-than available only, so we need the following tricks - -uint8x8_t vcge_s8(int8x8_t a, int8x8_t b); // VCGE.S8 d0, d0, d0 -_NEON2SSE_INLINE int8x8_t vcge_s8(int8x8_t a, int8x8_t b) -{ - int8x8_t res64; - return64(vcgeq_s8(_pM128i(a), _pM128i(b))); -} - - -uint16x4_t vcge_s16(int16x4_t a, int16x4_t b); // VCGE.S16 d0, d0, d0 -_NEON2SSE_INLINE int16x4_t vcge_s16(int16x4_t a, int16x4_t b) -{ - int16x4_t res64; - return64(vcgeq_s16(_pM128i(a), _pM128i(b))); -} - - -uint32x2_t vcge_s32(int32x2_t a, int32x2_t 
b); // VCGE.S32 d0, d0, d0 -_NEON2SSE_INLINE int32x2_t vcge_s32(int32x2_t a, int32x2_t b) -{ - int32x2_t res64; - return64(vcgeq_s32(_pM128i(a), _pM128i(b))); -} - - -uint32x2_t vcge_f32(float32x2_t a, float32x2_t b); // VCGE.F32 d0, d0, d0 -_NEON2SSE_INLINE uint32x2_t vcge_f32(float32x2_t a, float32x2_t b) -{ - uint32x2_t res64; - __m128 res; - res = _mm_cmpge_ps(_pM128(a),_pM128(b)); //use only 2 first entries - return64f(res); -} - -uint8x8_t vcge_u8(uint8x8_t a, uint8x8_t b); // VCGE.U8 d0, d0, d0 -_NEON2SSE_INLINE uint8x8_t vcge_u8(uint8x8_t a, uint8x8_t b) -{ - uint8x8_t res64; - return64(vcgeq_u8(_pM128i(a), _pM128i(b))); -} - - -uint16x4_t vcge_u16(uint16x4_t a, uint16x4_t b); // VCGE.s16 d0, d0, d0 -_NEON2SSE_INLINE uint16x4_t vcge_u16(uint16x4_t a, uint16x4_t b) -{ - uint16x4_t res64; - return64(vcgeq_u16(_pM128i(a), _pM128i(b))); -} - - -uint32x2_t vcge_u32(uint32x2_t a, uint32x2_t b); // VCGE.U32 d0, d0, d0 -_NEON2SSE_INLINE uint32x2_t vcge_u32(uint32x2_t a, uint32x2_t b) -{ - //serial solution looks faster - uint32x2_t res64; - return64(vcgeq_u32 (_pM128i(a), _pM128i(b))); -} - - - -uint8x16_t vcgeq_s8(int8x16_t a, int8x16_t b); // VCGE.S8 q0, q0, q0 -_NEON2SSE_INLINE uint8x16_t vcgeq_s8(int8x16_t a, int8x16_t b) // VCGE.S8 q0, q0, q0 -{ - __m128i m1, m2; - m1 = _mm_cmpgt_epi8 ( a, b); - m2 = _mm_cmpeq_epi8 ( a, b); - return _mm_or_si128 ( m1, m2); -} - -uint16x8_t vcgeq_s16(int16x8_t a, int16x8_t b); // VCGE.S16 q0, q0, q0 -_NEON2SSE_INLINE uint16x8_t vcgeq_s16(int16x8_t a, int16x8_t b) // VCGE.S16 q0, q0, q0 -{ - __m128i m1, m2; - m1 = _mm_cmpgt_epi16 ( a, b); - m2 = _mm_cmpeq_epi16 ( a, b); - return _mm_or_si128 ( m1,m2); -} - -uint32x4_t vcgeq_s32(int32x4_t a, int32x4_t b); // VCGE.S32 q0, q0, q0 -_NEON2SSE_INLINE uint32x4_t vcgeq_s32(int32x4_t a, int32x4_t b) // VCGE.S32 q0, q0, q0 -{ - __m128i m1, m2; - m1 = _mm_cmpgt_epi32 (a, b); - m2 = _mm_cmpeq_epi32 (a, b); - return _mm_or_si128 (m1, m2); -} - -uint32x4_t vcgeq_f32(float32x4_t a, float32x4_t 
b); // VCGE.F32 q0, q0, q0 -_NEON2SSE_INLINE uint32x4_t vcgeq_f32(float32x4_t a, float32x4_t b) -{ - __m128 res; - res = _mm_cmpge_ps(a,b); - return *(__m128i*)&res; -} - -uint8x16_t vcgeq_u8(uint8x16_t a, uint8x16_t b); // VCGE.U8 q0, q0, q0 -_NEON2SSE_INLINE uint8x16_t vcgeq_u8(uint8x16_t a, uint8x16_t b) // VCGE.U8 q0, q0, q0 -{ - //no unsigned chars comparison, only signed available,so need the trick - #ifdef USE_SSE4 - __m128i cmp; - cmp = _mm_max_epu8(a, b); - return _mm_cmpeq_epi8(cmp, a); //a>=b - #else - __m128i c128, as, bs, m1, m2; - c128 = _mm_set1_epi8 (128); - as = _mm_sub_epi8( a, c128); - bs = _mm_sub_epi8( b, c128); - m1 = _mm_cmpgt_epi8( as, bs); - m2 = _mm_cmpeq_epi8 (as, bs); - return _mm_or_si128 ( m1, m2); - #endif -} - -uint16x8_t vcgeq_u16(uint16x8_t a, uint16x8_t b); // VCGE.s16 q0, q0, q0 -_NEON2SSE_INLINE uint16x8_t vcgeq_u16(uint16x8_t a, uint16x8_t b) // VCGE.s16 q0, q0, q0 -{ - //no unsigned shorts comparison, only signed available,so need the trick - #ifdef USE_SSE4 - __m128i cmp; - cmp = _mm_max_epu16(a, b); - return _mm_cmpeq_epi16(cmp, a); //a>=b - #else - __m128i c8000, as, bs, m1, m2; - c8000 = _mm_set1_epi16 (0x8000); - as = _mm_sub_epi16(a,c8000); - bs = _mm_sub_epi16(b,c8000); - m1 = _mm_cmpgt_epi16(as, bs); - m2 = _mm_cmpeq_epi16 (as, bs); - return _mm_or_si128 ( m1, m2); - #endif -} - -uint32x4_t vcgeq_u32(uint32x4_t a, uint32x4_t b); // VCGE.U32 q0, q0, q0 -_NEON2SSE_INLINE uint32x4_t vcgeq_u32(uint32x4_t a, uint32x4_t b) // VCGE.U32 q0, q0, q0 -{ - //no unsigned ints comparison, only signed available,so need the trick - #ifdef USE_SSE4 - __m128i cmp; - cmp = _mm_max_epu32(a, b); - return _mm_cmpeq_epi32(cmp, a); //a>=b - #else - //serial solution may be faster - __m128i c80000000, as, bs, m1, m2; - c80000000 = _mm_set1_epi32 (0x80000000); - as = _mm_sub_epi32(a,c80000000); - bs = _mm_sub_epi32(b,c80000000); - m1 = _mm_cmpgt_epi32 (as, bs); - m2 = _mm_cmpeq_epi32 (as, bs); - return _mm_or_si128 (
m1, m2); - #endif -} - -//**********************Vector compare less-than or equal****************************** -//*************************************************************************************** -//in IA SIMD no less-than-or-equal comparison for integers present, so we need the tricks - -uint8x8_t vcle_s8(int8x8_t a, int8x8_t b); // VCGE.S8 d0, d0, d0 -_NEON2SSE_INLINE int8x8_t vcle_s8(int8x8_t a, int8x8_t b) -{ - int8x8_t res64; - return64(vcleq_s8(_pM128i(a), _pM128i(b))); -} - - -uint16x4_t vcle_s16(int16x4_t a, int16x4_t b); // VCGE.S16 d0, d0, d0 -_NEON2SSE_INLINE int16x4_t vcle_s16(int16x4_t a, int16x4_t b) -{ - int16x4_t res64; - return64(vcleq_s16(_pM128i(a), _pM128i(b))); -} - - -uint32x2_t vcle_s32(int32x2_t a, int32x2_t b); // VCGE.S32 d0, d0, d0 -_NEON2SSE_INLINE int32x2_t vcle_s32(int32x2_t a, int32x2_t b) -{ - int32x2_t res64; - return64(vcleq_s32(_pM128i(a), _pM128i(b))); -} - - -uint32x2_t vcle_f32(float32x2_t a, float32x2_t b); // VCGE.F32 d0, d0, d0? -_NEON2SSE_INLINE uint32x2_t vcle_f32(float32x2_t a, float32x2_t b) -{ - uint32x2_t res64; - __m128 res; - res = _mm_cmple_ps(_pM128(a),_pM128(b)); - return64f(res); -} - -uint8x8_t vcle_u8(uint8x8_t a, uint8x8_t b); // VCGE.U8 d0, d0, d0 -#define vcle_u8(a,b) vcge_u8(b,a) - - -uint16x4_t vcle_u16(uint16x4_t a, uint16x4_t b); // VCGE.s16 d0, d0, d0 -#define vcle_u16(a,b) vcge_u16(b,a) - - -uint32x2_t vcle_u32(uint32x2_t a, uint32x2_t b); // VCGE.U32 d0, d0, d0 -#define vcle_u32(a,b) vcge_u32(b,a) - -uint8x16_t vcleq_s8(int8x16_t a, int8x16_t b); // VCGE.S8 q0, q0, q0 -_NEON2SSE_INLINE uint8x16_t vcleq_s8(int8x16_t a, int8x16_t b) // VCGE.S8 q0, q0, q0 -{ - __m128i c1, res; - c1 = _mm_cmpeq_epi8 (a,a); //all ones 0xff.... 
- res = _mm_cmpgt_epi8 ( a, b); - return _mm_andnot_si128 (res, c1); //invert the cmpgt result, get less-than-or-equal -} - -uint16x8_t vcleq_s16(int16x8_t a, int16x8_t b); // VCGE.S16 q0, q0, q0 -_NEON2SSE_INLINE uint16x8_t vcleq_s16(int16x8_t a, int16x8_t b) // VCGE.S16 q0, q0, q0 -{ - __m128i c1, res; - c1 = _mm_cmpeq_epi16 (a,a); //all ones 0xff.... - res = _mm_cmpgt_epi16 ( a, b); - return _mm_andnot_si128 (res, c1); -} - -uint32x4_t vcleq_s32(int32x4_t a, int32x4_t b); // VCGE.S32 q0, q0, q0 -_NEON2SSE_INLINE uint32x4_t vcleq_s32(int32x4_t a, int32x4_t b) // VCGE.S32 q0, q0, q0 -{ - __m128i c1, res; - c1 = _mm_cmpeq_epi32 (a,a); //all ones 0xff.... - res = _mm_cmpgt_epi32 ( a, b); - return _mm_andnot_si128 (res, c1); -} - -uint32x4_t vcleq_f32(float32x4_t a, float32x4_t b); // VCGE.F32 q0, q0, q0 -_NEON2SSE_INLINE uint32x4_t vcleq_f32(float32x4_t a, float32x4_t b) -{ - __m128 res; - res = _mm_cmple_ps(a,b); - return *(__m128i*)&res; -} - -uint8x16_t vcleq_u8(uint8x16_t a, uint8x16_t b); // VCGE.U8 q0, q0, q0 -#ifdef USE_SSE4 - _NEON2SSE_INLINE uint8x16_t vcleq_u8(uint8x16_t a, uint8x16_t b) // VCGE.U8 q0, q0, q0 - { - //no unsigned chars comparison in SSE, only signed available,so need the trick - __m128i cmp; - cmp = _mm_min_epu8(a, b); - return _mm_cmpeq_epi8(cmp, a); //a<=b - } -#else - #define vcleq_u8(a,b) vcgeq_u8(b,a) -#endif - - -uint16x8_t vcleq_u16(uint16x8_t a, uint16x8_t b); // VCGE.s16 q0, q0, q0 -#ifdef USE_SSE4 - _NEON2SSE_INLINE uint16x8_t vcleq_u16(uint16x8_t a, uint16x8_t b) // VCGE.s16 q0, q0, q0 - { - //no unsigned shorts comparison in SSE, only signed available,so need the trick - __m128i cmp; - cmp = _mm_min_epu16(a, b); - return _mm_cmpeq_epi16(cmp, a); //a<=b - } -#else - #define vcleq_u16(a,b) vcgeq_u16(b,a) -#endif - - -uint32x4_t vcleq_u32(uint32x4_t a, uint32x4_t b); // VCGE.U32 q0, q0, q0 -#ifdef USE_SSE4 - _NEON2SSE_INLINE uint32x4_t vcleq_u32(uint32x4_t a, uint32x4_t b) // VCGE.U32 q0, q0, q0 - { - //no unsigned ints
comparison in SSE, only signed available,so need the trick - __m128i cmp; - cmp = _mm_min_epu32(a, b); - return _mm_cmpeq_epi32(cmp, a); //a<=b - } -#else -//solution may be not optimal compared with the serial one - #define vcleq_u32(a,b) vcgeq_u32(b,a) -#endif - - -//****** Vector compare greater-than ****************************************** -//************************************************************************** -uint8x8_t vcgt_s8(int8x8_t a, int8x8_t b); // VCGT.S8 d0, d0, d0 -_NEON2SSE_INLINE int8x8_t vcgt_s8(int8x8_t a, int8x8_t b) -{ - int8x8_t res64; - return64(_mm_cmpgt_epi8(_pM128i(a),_pM128i(b))); -} - - -uint16x4_t vcgt_s16(int16x4_t a, int16x4_t b); // VCGT.S16 d0, d0, d0 -_NEON2SSE_INLINE int16x4_t vcgt_s16(int16x4_t a, int16x4_t b) -{ - int16x4_t res64; - return64(_mm_cmpgt_epi16(_pM128i(a),_pM128i(b))); -} - - -uint32x2_t vcgt_s32(int32x2_t a, int32x2_t b); // VCGT.S32 d0, d0, d0 -_NEON2SSE_INLINE int32x2_t vcgt_s32(int32x2_t a, int32x2_t b) -{ - int32x2_t res64; - return64(_mm_cmpgt_epi32(_pM128i(a),_pM128i(b))); -} - - -uint32x2_t vcgt_f32(float32x2_t a, float32x2_t b); // VCGT.F32 d0, d0, d0 -_NEON2SSE_INLINE uint32x2_t vcgt_f32(float32x2_t a, float32x2_t b) -{ - uint32x2_t res64; - __m128 res; - res = _mm_cmpgt_ps(_pM128(a),_pM128(b)); //use only 2 first entries - return64f(res); -} - -uint8x8_t vcgt_u8(uint8x8_t a, uint8x8_t b); // VCGT.U8 d0, d0, d0 -_NEON2SSE_INLINE uint8x8_t vcgt_u8(uint8x8_t a, uint8x8_t b) -{ - uint8x8_t res64; - return64(vcgtq_u8(_pM128i(a), _pM128i(b))); -} - - -uint16x4_t vcgt_u16(uint16x4_t a, uint16x4_t b); // VCGT.s16 d0, d0, d0 -_NEON2SSE_INLINE uint16x4_t vcgt_u16(uint16x4_t a, uint16x4_t b) -{ - uint16x4_t res64; - return64(vcgtq_u16(_pM128i(a), _pM128i(b))); -} - - -uint32x2_t vcgt_u32(uint32x2_t a, uint32x2_t b); // VCGT.U32 d0, d0, d0 -_NEON2SSE_INLINE uint32x2_t vcgt_u32(uint32x2_t a, uint32x2_t b) -{ - uint32x2_t res64; - return64(vcgtq_u32(_pM128i(a), _pM128i(b))); -} - - -uint8x16_t 
vcgtq_s8(int8x16_t a, int8x16_t b); // VCGT.S8 q0, q0, q0 -#define vcgtq_s8 _mm_cmpgt_epi8 - -uint16x8_t vcgtq_s16(int16x8_t a, int16x8_t b); // VCGT.S16 q0, q0, q0 -#define vcgtq_s16 _mm_cmpgt_epi16 - -uint32x4_t vcgtq_s32(int32x4_t a, int32x4_t b); // VCGT.S32 q0, q0, q0 -#define vcgtq_s32 _mm_cmpgt_epi32 - -uint32x4_t vcgtq_f32(float32x4_t a, float32x4_t b); // VCGT.F32 q0, q0, q0 -_NEON2SSE_INLINE uint32x4_t vcgtq_f32(float32x4_t a, float32x4_t b) -{ - __m128 res; - res = _mm_cmpgt_ps(a,b); - return *(__m128i*)&res; -} - -uint8x16_t vcgtq_u8(uint8x16_t a, uint8x16_t b); // VCGT.U8 q0, q0, q0 -_NEON2SSE_INLINE uint8x16_t vcgtq_u8(uint8x16_t a, uint8x16_t b) // VCGT.U8 q0, q0, q0 -{ - //no unsigned chars comparison, only signed available,so need the trick - __m128i c128, as, bs; - c128 = _mm_set1_epi8 (128); - as = _mm_sub_epi8(a,c128); - bs = _mm_sub_epi8(b,c128); - return _mm_cmpgt_epi8 (as, bs); -} - -uint16x8_t vcgtq_u16(uint16x8_t a, uint16x8_t b); // VCGT.s16 q0, q0, q0 -_NEON2SSE_INLINE uint16x8_t vcgtq_u16(uint16x8_t a, uint16x8_t b) // VCGT.s16 q0, q0, q0 -{ - //no unsigned short comparison, only signed available,so need the trick - __m128i c8000, as, bs; - c8000 = _mm_set1_epi16 (0x8000); - as = _mm_sub_epi16(a,c8000); - bs = _mm_sub_epi16(b,c8000); - return _mm_cmpgt_epi16 ( as, bs); -} - -uint32x4_t vcgtq_u32(uint32x4_t a, uint32x4_t b); // VCGT.U32 q0, q0, q0 -_NEON2SSE_INLINE uint32x4_t vcgtq_u32(uint32x4_t a, uint32x4_t b) // VCGT.U32 q0, q0, q0 -{ - //no unsigned int comparison, only signed available,so need the trick - __m128i c80000000, as, bs; - c80000000 = _mm_set1_epi32 (0x80000000); - as = _mm_sub_epi32(a,c80000000); - bs = _mm_sub_epi32(b,c80000000); - return _mm_cmpgt_epi32 ( as, bs); -} - -//********************* Vector compare less-than ************************** -//************************************************************************* -uint8x8_t vclt_s8(int8x8_t a, int8x8_t b); // VCGT.S8 d0, d0, d0
-#define vclt_s8(a,b) vcgt_s8(b,a) //swap the arguments!! - - -uint16x4_t vclt_s16(int16x4_t a, int16x4_t b); // VCGT.S16 d0, d0, d0 -#define vclt_s16(a,b) vcgt_s16(b,a) //swap the arguments!! - - -uint32x2_t vclt_s32(int32x2_t a, int32x2_t b); // VCGT.S32 d0, d0, d0 -#define vclt_s32(a,b) vcgt_s32(b,a) //swap the arguments!! - - -uint32x2_t vclt_f32(float32x2_t a, float32x2_t b); // VCGT.F32 d0, d0, d0 -#define vclt_f32(a,b) vcgt_f32(b, a) //swap the arguments!! - -uint8x8_t vclt_u8(uint8x8_t a, uint8x8_t b); // VCGT.U8 d0, d0, d0 -#define vclt_u8(a,b) vcgt_u8(b,a) //swap the arguments!! - -uint16x4_t vclt_u16(uint16x4_t a, uint16x4_t b); // VCGT.s16 d0, d0, d0 -#define vclt_u16(a,b) vcgt_u16(b,a) //swap the arguments!! - -uint32x2_t vclt_u32(uint32x2_t a, uint32x2_t b); // VCGT.U32 d0, d0, d0 -#define vclt_u32(a,b) vcgt_u32(b,a) //swap the arguments!! - -uint8x16_t vcltq_s8(int8x16_t a, int8x16_t b); // VCGT.S8 q0, q0, q0 -#define vcltq_s8(a,b) vcgtq_s8(b, a) //swap the arguments!! - -uint16x8_t vcltq_s16(int16x8_t a, int16x8_t b); // VCGT.S16 q0, q0, q0 -#define vcltq_s16(a,b) vcgtq_s16(b, a) //swap the arguments!! - -uint32x4_t vcltq_s32(int32x4_t a, int32x4_t b); // VCGT.S32 q0, q0, q0 -#define vcltq_s32(a,b) vcgtq_s32(b, a) //swap the arguments!! - -uint32x4_t vcltq_f32(float32x4_t a, float32x4_t b); // VCGT.F32 q0, q0, q0 -#define vcltq_f32(a,b) vcgtq_f32(b, a) //swap the arguments!! - -uint8x16_t vcltq_u8(uint8x16_t a, uint8x16_t b); // VCGT.U8 q0, q0, q0 -#define vcltq_u8(a,b) vcgtq_u8(b, a) //swap the arguments!! - -uint16x8_t vcltq_u16(uint16x8_t a, uint16x8_t b); // VCGT.s16 q0, q0, q0 -#define vcltq_u16(a,b) vcgtq_u16(b, a) //swap the arguments!! - -uint32x4_t vcltq_u32(uint32x4_t a, uint32x4_t b); // VCGT.U32 q0, q0, q0 -#define vcltq_u32(a,b) vcgtq_u32(b, a) //swap the arguments!! 
- -//*****************Vector compare absolute greater-than or equal ************ -//*************************************************************************** -uint32x2_t vcage_f32(float32x2_t a, float32x2_t b); // VACGE.F32 d0, d0, d0 -_NEON2SSE_INLINE uint32x2_t vcage_f32(float32x2_t a, float32x2_t b) -{ - uint32x2_t res64; - __m128i c7fffffff; - __m128 a0, b0; - c7fffffff = _mm_set1_epi32 (0x7fffffff); - a0 = _mm_and_ps (_pM128(a), *(__m128*)&c7fffffff); - b0 = _mm_and_ps (_pM128(b), *(__m128*)&c7fffffff); - a0 = _mm_cmpge_ps ( a0, b0); - return64f(a0); -} - -uint32x4_t vcageq_f32(float32x4_t a, float32x4_t b); // VACGE.F32 q0, q0, q0 -_NEON2SSE_INLINE uint32x4_t vcageq_f32(float32x4_t a, float32x4_t b) // VACGE.F32 q0, q0, q0 -{ - __m128i c7fffffff; - __m128 a0, b0; - c7fffffff = _mm_set1_epi32 (0x7fffffff); - a0 = _mm_and_ps (a, *(__m128*)&c7fffffff); - b0 = _mm_and_ps (b, *(__m128*)&c7fffffff); - a0 = _mm_cmpge_ps ( a0, b0); - return (*(__m128i*)&a0); -} - -//********Vector compare absolute less-than or equal ****************** -//******************************************************************** -uint32x2_t vcale_f32(float32x2_t a, float32x2_t b); // VACGE.F32 d0, d0, d0 -_NEON2SSE_INLINE uint32x2_t vcale_f32(float32x2_t a, float32x2_t b) -{ - uint32x2_t res64; - __m128i c7fffffff; - __m128 a0, b0; - c7fffffff = _mm_set1_epi32 (0x7fffffff); - a0 = _mm_and_ps (_pM128(a), *(__m128*)&c7fffffff); - b0 = _mm_and_ps (_pM128(b), *(__m128*)&c7fffffff); - a0 = _mm_cmple_ps (a0, b0); - return64f(a0); -} - -uint32x4_t vcaleq_f32(float32x4_t a, float32x4_t b); // VACGE.F32 q0, q0, q0 -_NEON2SSE_INLINE uint32x4_t vcaleq_f32(float32x4_t a, float32x4_t b) // VACGE.F32 q0, q0, q0 -{ - __m128i c7fffffff; - __m128 a0, b0; - c7fffffff = _mm_set1_epi32 (0x7fffffff); - a0 = _mm_and_ps (a, *(__m128*)&c7fffffff); - b0 = _mm_and_ps (b, *(__m128*)&c7fffffff); - a0 = _mm_cmple_ps (a0, b0); - return (*(__m128i*)&a0); -} - -//******** Vector compare absolute greater-than 
******************
-//******************************************************************
-uint32x2_t vcagt_f32(float32x2_t a, float32x2_t b); // VACGT.F32 d0, d0, d0
-_NEON2SSE_INLINE uint32x2_t vcagt_f32(float32x2_t a, float32x2_t b)
-{
-    uint32x2_t res64;
-    __m128i c7fffffff;
-    __m128 a0, b0;
-    c7fffffff = _mm_set1_epi32 (0x7fffffff);
-    a0 = _mm_and_ps (_pM128(a), *(__m128*)&c7fffffff);
-    b0 = _mm_and_ps (_pM128(b), *(__m128*)&c7fffffff);
-    a0 = _mm_cmpgt_ps (a0, b0);
-    return64f(a0);
-}
-
-uint32x4_t vcagtq_f32(float32x4_t a, float32x4_t b); // VACGT.F32 q0, q0, q0
-_NEON2SSE_INLINE uint32x4_t vcagtq_f32(float32x4_t a, float32x4_t b) // VACGT.F32 q0, q0, q0
-{
-    __m128i c7fffffff;
-    __m128 a0, b0;
-    c7fffffff = _mm_set1_epi32 (0x7fffffff);
-    a0 = _mm_and_ps (a, *(__m128*)&c7fffffff);
-    b0 = _mm_and_ps (b, *(__m128*)&c7fffffff);
-    a0 = _mm_cmpgt_ps (a0, b0);
-    return (*(__m128i*)&a0);
-}
-
-//***************Vector compare absolute less-than ***********************
-//*************************************************************************
-uint32x2_t vcalt_f32(float32x2_t a, float32x2_t b); // VACGT.F32 d0, d0, d0
-_NEON2SSE_INLINE uint32x2_t vcalt_f32(float32x2_t a, float32x2_t b)
-{
-    uint32x2_t res64;
-    __m128i c7fffffff;
-    __m128 a0, b0;
-    c7fffffff = _mm_set1_epi32 (0x7fffffff);
-    a0 = _mm_and_ps (_pM128(a), *(__m128*)&c7fffffff);
-    b0 = _mm_and_ps (_pM128(b), *(__m128*)&c7fffffff);
-    a0 = _mm_cmplt_ps (a0, b0);
-    return64f(a0);
-}
-
-uint32x4_t vcaltq_f32(float32x4_t a, float32x4_t b); // VACGT.F32 q0, q0, q0
-_NEON2SSE_INLINE uint32x4_t vcaltq_f32(float32x4_t a, float32x4_t b) // VACGT.F32 q0, q0, q0
-{
-    __m128i c7fffffff;
-    __m128 a0, b0;
-    c7fffffff = _mm_set1_epi32 (0x7fffffff);
-    a0 = _mm_and_ps (a, *(__m128*)&c7fffffff);
-    b0 = _mm_and_ps (b, *(__m128*)&c7fffffff);
-    a0 = _mm_cmplt_ps (a0, b0);
-    return (*(__m128i*)&a0);
-}
-
-//*************************Vector test bits************************************
-//*****************************************************************************
-/*VTST (Vector Test Bits) takes each element in a vector, and bitwise logical ANDs them
-with the corresponding element of a second vector. If the result is not zero, the
-corresponding element in the destination vector is set to all ones. Otherwise, it is set to
-all zeros. */
-
-uint8x8_t vtst_s8(int8x8_t a, int8x8_t b); // VTST.8 d0, d0, d0
-_NEON2SSE_INLINE uint8x8_t vtst_s8(int8x8_t a, int8x8_t b)
-{
-    int8x8_t res64;
-    return64(vtstq_s8(_pM128i(a), _pM128i(b)));
-}
-
-
-uint16x4_t vtst_s16(int16x4_t a, int16x4_t b); // VTST.16 d0, d0, d0
-_NEON2SSE_INLINE uint16x4_t vtst_s16(int16x4_t a, int16x4_t b)
-{
-    int16x4_t res64;
-    return64(vtstq_s16(_pM128i(a), _pM128i(b)));
-}
-
-
-uint32x2_t vtst_s32(int32x2_t a, int32x2_t b); // VTST.32 d0, d0, d0
-_NEON2SSE_INLINE uint32x2_t vtst_s32(int32x2_t a, int32x2_t b)
-{
-    int32x2_t res64;
-    return64(vtstq_s32(_pM128i(a), _pM128i(b)));
-}
-
-
-uint8x8_t vtst_u8(uint8x8_t a, uint8x8_t b); // VTST.8 d0, d0, d0
-#define vtst_u8 vtst_s8
-
-uint16x4_t vtst_u16(uint16x4_t a, uint16x4_t b); // VTST.16 d0, d0, d0
-#define vtst_u16 vtst_s16
-
-uint32x2_t vtst_u32(uint32x2_t a, uint32x2_t b); // VTST.32 d0, d0, d0
-#define vtst_u32 vtst_s32
-
-
-uint8x8_t vtst_p8(poly8x8_t a, poly8x8_t b); // VTST.8 d0, d0, d0
-#define vtst_p8 vtst_u8
-
-uint8x16_t vtstq_s8(int8x16_t a, int8x16_t b); // VTST.8 q0, q0, q0
-_NEON2SSE_INLINE uint8x16_t vtstq_s8(int8x16_t a, int8x16_t b) // VTST.8 q0, q0, q0
-{
-    __m128i zero, one, res;
-    zero = _mm_setzero_si128 ();
-    one = _mm_cmpeq_epi8(zero,zero); //0xfff..ffff
-    res = _mm_and_si128 (a, b);
-    res = _mm_cmpeq_epi8 (res, zero);
-    return _mm_xor_si128(res, one); //invert result
-}
-
-uint16x8_t vtstq_s16(int16x8_t a, int16x8_t b); // VTST.16 q0, q0, q0
-_NEON2SSE_INLINE uint16x8_t vtstq_s16(int16x8_t a, int16x8_t b) // VTST.16 q0, q0, q0
-{
-    __m128i zero, one, res;
-    zero = _mm_setzero_si128 ();
-    one =
_mm_cmpeq_epi8(zero,zero); //0xfff..ffff
-    res = _mm_and_si128 (a, b);
-    res = _mm_cmpeq_epi16 (res, zero);
-    return _mm_xor_si128(res, one); //invert result
-}
-
-uint32x4_t vtstq_s32(int32x4_t a, int32x4_t b); // VTST.32 q0, q0, q0
-_NEON2SSE_INLINE uint32x4_t vtstq_s32(int32x4_t a, int32x4_t b) // VTST.32 q0, q0, q0
-{
-    __m128i zero, one, res;
-    zero = _mm_setzero_si128 ();
-    one = _mm_cmpeq_epi8(zero,zero); //0xfff..ffff
-    res = _mm_and_si128 (a, b);
-    res = _mm_cmpeq_epi32 (res, zero);
-    return _mm_xor_si128(res, one); //invert result
-}
-
-uint8x16_t vtstq_u8(uint8x16_t a, uint8x16_t b); // VTST.8 q0, q0, q0
-#define vtstq_u8 vtstq_s8
-
-uint16x8_t vtstq_u16(uint16x8_t a, uint16x8_t b); // VTST.16 q0, q0, q0
-#define vtstq_u16 vtstq_s16
-
-uint32x4_t vtstq_u32(uint32x4_t a, uint32x4_t b); // VTST.32 q0, q0, q0
-#define vtstq_u32 vtstq_s32
-
-uint8x16_t vtstq_p8(poly8x16_t a, poly8x16_t b); // VTST.8 q0, q0, q0
-#define vtstq_p8 vtstq_u8
-
-//****************** Absolute difference ********************
-//*** Absolute difference between the arguments: Vr[i] = | Va[i] - Vb[i] |*****
-//************************************************************
-int8x8_t vabd_s8(int8x8_t a, int8x8_t b); // VABD.S8 d0,d0,d0
-_NEON2SSE_INLINE int8x8_t vabd_s8(int8x8_t a, int8x8_t b)
-{
-    int8x8_t res64;
-    return64(vabdq_s8(_pM128i(a), _pM128i(b)));
-}
-
-int16x4_t vabd_s16(int16x4_t a, int16x4_t b); // VABD.S16 d0,d0,d0
-_NEON2SSE_INLINE int16x4_t vabd_s16(int16x4_t a, int16x4_t b)
-{
-    int16x4_t res64;
-    return64(vabdq_s16(_pM128i(a), _pM128i(b)));
-}
-
-int32x2_t vabd_s32(int32x2_t a, int32x2_t b); // VABD.S32 d0,d0,d0
-_NEON2SSE_INLINE int32x2_t vabd_s32(int32x2_t a, int32x2_t b)
-{
-    int32x2_t res64;
-    return64(vabdq_s32(_pM128i(a), _pM128i(b)));
-}
-
-uint8x8_t vabd_u8(uint8x8_t a, uint8x8_t b); // VABD.U8 d0,d0,d0
-_NEON2SSE_INLINE uint8x8_t vabd_u8(uint8x8_t a, uint8x8_t b)
-{
-    uint8x8_t res64;
-    return64(vabdq_u8(_pM128i(a), _pM128i(b)));
-}
-
-uint16x4_t
vabd_u16(uint16x4_t a, uint16x4_t b); // VABD.U16 d0,d0,d0
-_NEON2SSE_INLINE uint16x4_t vabd_u16(uint16x4_t a, uint16x4_t b)
-{
-    uint16x4_t res64;
-    return64(vabdq_u16(_pM128i(a), _pM128i(b)));
-}
-
-uint32x2_t vabd_u32(uint32x2_t a, uint32x2_t b); // VABD.U32 d0,d0,d0
-_NEON2SSE_INLINE uint32x2_t vabd_u32(uint32x2_t a, uint32x2_t b)
-{
-    uint32x2_t res64;
-    return64(vabdq_u32(_pM128i(a), _pM128i(b)));
-}
-
-float32x2_t vabd_f32(float32x2_t a, float32x2_t b); // VABD.F32 d0,d0,d0
-_NEON2SSE_INLINE float32x2_t vabd_f32(float32x2_t a, float32x2_t b)
-{
-    float32x4_t res;
-    __m64_128 res64;
-    res = vabdq_f32(_pM128(a), _pM128(b));
-    _M64f(res64, res);
-    return res64;
-}
-
-int8x16_t vabdq_s8(int8x16_t a, int8x16_t b); // VABD.S8 q0,q0,q0
-_NEON2SSE_INLINE int8x16_t vabdq_s8(int8x16_t a, int8x16_t b) // VABD.S8 q0,q0,q0
-{
-    __m128i res;
-    res = _mm_sub_epi8 (a, b);
-    return _mm_abs_epi8 (res);
-}
-
-int16x8_t vabdq_s16(int16x8_t a, int16x8_t b); // VABD.S16 q0,q0,q0
-_NEON2SSE_INLINE int16x8_t vabdq_s16(int16x8_t a, int16x8_t b) // VABD.S16 q0,q0,q0
-{
-    __m128i res;
-    res = _mm_sub_epi16 (a,b);
-    return _mm_abs_epi16 (res);
-}
-
-int32x4_t vabdq_s32(int32x4_t a, int32x4_t b); // VABD.S32 q0,q0,q0
-_NEON2SSE_INLINE int32x4_t vabdq_s32(int32x4_t a, int32x4_t b) // VABD.S32 q0,q0,q0
-{
-    __m128i res;
-    res = _mm_sub_epi32 (a,b);
-    return _mm_abs_epi32 (res);
-}
-
-uint8x16_t vabdq_u8(uint8x16_t a, uint8x16_t b); // VABD.U8 q0,q0,q0
-_NEON2SSE_INLINE uint8x16_t vabdq_u8(uint8x16_t a, uint8x16_t b) //no abs for unsigned
-{
-    __m128i cmp, difab, difba;
-    cmp = vcgtq_u8(a,b);
-    difab = _mm_sub_epi8(a,b);
-    difba = _mm_sub_epi8 (b,a);
-    difab = _mm_and_si128(cmp, difab);
-    difba = _mm_andnot_si128(cmp, difba);
-    return _mm_or_si128(difab, difba);
-}
-
-uint16x8_t vabdq_u16(uint16x8_t a, uint16x8_t b); // VABD.U16 q0,q0,q0
-_NEON2SSE_INLINE uint16x8_t vabdq_u16(uint16x8_t a, uint16x8_t b)
-{
-    __m128i cmp, difab, difba;
-    cmp = vcgtq_u16(a,b);
-    difab = _mm_sub_epi16(a,b);
-
difba = _mm_sub_epi16 (b,a);
-    difab = _mm_and_si128(cmp, difab);
-    difba = _mm_andnot_si128(cmp, difba);
-    return _mm_or_si128(difab, difba);
-}
-
-uint32x4_t vabdq_u32(uint32x4_t a, uint32x4_t b); // VABD.U32 q0,q0,q0
-_NEON2SSE_INLINE uint32x4_t vabdq_u32(uint32x4_t a, uint32x4_t b)
-{
-    __m128i cmp, difab, difba;
-    cmp = vcgtq_u32(a,b);
-    difab = _mm_sub_epi32(a,b);
-    difba = _mm_sub_epi32 (b,a);
-    difab = _mm_and_si128(cmp, difab);
-    difba = _mm_andnot_si128(cmp, difba);
-    return _mm_or_si128(difab, difba);
-}
-
-float32x4_t vabdq_f32(float32x4_t a, float32x4_t b); // VABD.F32 q0,q0,q0
-_NEON2SSE_INLINE float32x4_t vabdq_f32(float32x4_t a, float32x4_t b) // VABD.F32 q0,q0,q0
-{
-    __m128i c1;
-    __m128 res;
-    c1 = _mm_set1_epi32(0x7fffffff);
-    res = _mm_sub_ps (a, b);
-    return _mm_and_ps (res, *(__m128*)&c1);
-}
-
-//************ Absolute difference - long **************************
-//********************************************************************
-int16x8_t vabdl_s8(int8x8_t a, int8x8_t b); // VABDL.S8 q0,d0,d0
-_NEON2SSE_INLINE int16x8_t vabdl_s8(int8x8_t a, int8x8_t b) // VABDL.S8 q0,d0,d0
-{
-    __m128i a16, b16;
-    a16 = _MM_CVTEPI8_EPI16 (_pM128i(a)); //SSE4.1,
-    b16 = _MM_CVTEPI8_EPI16 (_pM128i(b)); //SSE4.1,
-    return vabdq_s16(a16, b16);
-
-}
-
-int32x4_t vabdl_s16(int16x4_t a, int16x4_t b); // VABDL.S16 q0,d0,d0
-_NEON2SSE_INLINE int32x4_t vabdl_s16(int16x4_t a, int16x4_t b) // VABDL.S16 q0,d0,d0
-{
-    __m128i a32, b32;
-    a32 = _MM_CVTEPI16_EPI32 (_pM128i(a)); //SSE4.1
-    b32 = _MM_CVTEPI16_EPI32 (_pM128i(b)); //SSE4.1,
-    return vabdq_s32(a32, b32);
-}
-
-int64x2_t vabdl_s32(int32x2_t a, int32x2_t b); // VABDL.S32 q0,d0,d0
-_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING (int64x2_t vabdl_s32(int32x2_t a, int32x2_t b),_NEON2SSE_REASON_SLOW_SERIAL)
-{
-    //no optimal SIMD solution, serial looks faster
-    _NEON2SSE_ALIGN_16 int64_t res[2];
-    if(a.m64_i32[0] > b.m64_i32[0]) res[0] = ( int64_t) a.m64_i32[0] - ( int64_t) b.m64_i32[0];
-    else res[0] = ( int64_t)
b.m64_i32[0] - ( int64_t) a.m64_i32[0];
-    if(a.m64_i32[1] > b.m64_i32[1]) res[1] = ( int64_t) a.m64_i32[1] - ( int64_t) b.m64_i32[1];
-    else res[1] = ( int64_t) b.m64_i32[1] - ( int64_t) a.m64_i32[1];
-    return _mm_load_si128((__m128i*)res);
-}
-
-uint16x8_t vabdl_u8(uint8x8_t a, uint8x8_t b); // VABDL.U8 q0,d0,d0
-_NEON2SSE_INLINE uint16x8_t vabdl_u8(uint8x8_t a, uint8x8_t b)
-{
-    __m128i res;
-    res = vsubl_u8(a,b);
-    return _mm_abs_epi16(res);
-}
-
-uint32x4_t vabdl_u16(uint16x4_t a, uint16x4_t b); // VABDL.U16 q0,d0,d0
-_NEON2SSE_INLINE uint32x4_t vabdl_u16(uint16x4_t a, uint16x4_t b)
-{
-    __m128i res;
-    res = vsubl_u16(a,b);
-    return _mm_abs_epi32(res);
-}
-
-uint64x2_t vabdl_u32(uint32x2_t a, uint32x2_t b); // VABDL.U32 q0,d0,d0
-_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING (uint64x2_t vabdl_u32(uint32x2_t a, uint32x2_t b), _NEON2SSE_REASON_SLOW_SERIAL)
-{
-    _NEON2SSE_ALIGN_16 uint64_t res[2];
-    if(a.m64_u32[0] > b.m64_u32[0]) res[0] = ( uint64_t) a.m64_u32[0] - ( uint64_t) b.m64_u32[0];
-    else res[0] = ( uint64_t) b.m64_u32[0] - ( uint64_t) a.m64_u32[0];
-    if(a.m64_u32[1] > b.m64_u32[1]) res[1] = ( uint64_t) a.m64_u32[1] - ( uint64_t) b.m64_u32[1];
-    else res[1] = ( uint64_t) b.m64_u32[1] - ( uint64_t) a.m64_u32[1];
-    return _mm_load_si128((__m128i*)res);
-}
-
-//**********Absolute difference and accumulate: Vr[i] = Va[i] + | Vb[i] - Vc[i] | *************
-//*********************************************************************************************
-int8x8_t vaba_s8(int8x8_t a, int8x8_t b, int8x8_t c); // VABA.S8 d0,d0,d0
-_NEON2SSE_INLINE int8x8_t vaba_s8(int8x8_t a, int8x8_t b, int8x8_t c)
-{
-    int8x8_t res64;
-    return64(vabaq_s8(_pM128i(a),_pM128i(b), _pM128i(c)));
-}
-
-int16x4_t vaba_s16(int16x4_t a, int16x4_t b, int16x4_t c); // VABA.S16 d0,d0,d0
-_NEON2SSE_INLINE int16x4_t vaba_s16(int16x4_t a, int16x4_t b, int16x4_t c)
-{
-    int16x4_t res64;
-    return64(vabaq_s16(_pM128i(a), _pM128i(b), _pM128i(c)));
-}
-
-int32x2_t vaba_s32(int32x2_t a, int32x2_t b,
int32x2_t c); // VABA.S32 d0,d0,d0
-_NEON2SSE_INLINE int32x2_t vaba_s32(int32x2_t a, int32x2_t b, int32x2_t c)
-{
-    int32x2_t res64;
-    return64(vabaq_s32(_pM128i(a), _pM128i(b), _pM128i(c)));
-}
-
-uint8x8_t vaba_u8(uint8x8_t a, uint8x8_t b, uint8x8_t c); // VABA.U8 d0,d0,d0
-#define vaba_u8 vaba_s8
-
-
-uint16x4_t vaba_u16(uint16x4_t a, uint16x4_t b, uint16x4_t c); // VABA.U16 d0,d0,d0
-#define vaba_u16 vaba_s16
-
-uint32x2_t vaba_u32(uint32x2_t a, uint32x2_t b, uint32x2_t c); // VABA.U32 d0,d0,d0
-_NEON2SSE_INLINE uint32x2_t vaba_u32(uint32x2_t a, uint32x2_t b, uint32x2_t c)
-{
-    uint32x2_t res64;
-    return64(vabaq_u32(_pM128i(a), _pM128i(b), _pM128i(c)));
-}
-
-int8x16_t vabaq_s8(int8x16_t a, int8x16_t b, int8x16_t c); // VABA.S8 q0,q0,q0
-_NEON2SSE_INLINE int8x16_t vabaq_s8(int8x16_t a, int8x16_t b, int8x16_t c) // VABA.S8 q0,q0,q0
-{
-    int8x16_t sub;
-    sub = vabdq_s8(b, c);
-    return vaddq_s8( a, sub);
-}
-
-int16x8_t vabaq_s16(int16x8_t a, int16x8_t b, int16x8_t c); // VABA.S16 q0,q0,q0
-_NEON2SSE_INLINE int16x8_t vabaq_s16(int16x8_t a, int16x8_t b, int16x8_t c) // VABA.S16 q0,q0,q0
-{
-    int16x8_t sub;
-    sub = vabdq_s16(b, c);
-    return vaddq_s16( a, sub);
-}
-
-int32x4_t vabaq_s32(int32x4_t a, int32x4_t b, int32x4_t c); // VABA.S32 q0,q0,q0
-_NEON2SSE_INLINE int32x4_t vabaq_s32(int32x4_t a, int32x4_t b, int32x4_t c) // VABA.S32 q0,q0,q0
-{
-    int32x4_t sub;
-    sub = vabdq_s32(b, c);
-    return vaddq_s32( a, sub);
-}
-
-uint8x16_t vabaq_u8(uint8x16_t a, uint8x16_t b, uint8x16_t c); // VABA.U8 q0,q0,q0
-_NEON2SSE_INLINE uint8x16_t vabaq_u8(uint8x16_t a, uint8x16_t b, uint8x16_t c)
-{
-    uint8x16_t sub;
-    sub = vabdq_u8(b, c);
-    return vaddq_u8( a, sub);
-}
-
-uint16x8_t vabaq_u16(uint16x8_t a, uint16x8_t b, uint16x8_t c); // VABA.U16 q0,q0,q0
-_NEON2SSE_INLINE uint16x8_t vabaq_u16(uint16x8_t a, uint16x8_t b, uint16x8_t c)
-{
-    uint16x8_t sub;
-    sub = vabdq_u16(b, c);
-    return vaddq_u16( a, sub);
-}
-
-uint32x4_t vabaq_u32(uint32x4_t a, uint32x4_t b, uint32x4_t c); //
VABA.U32 q0,q0,q0
-_NEON2SSE_INLINE uint32x4_t vabaq_u32(uint32x4_t a, uint32x4_t b, uint32x4_t c)
-{
-    uint32x4_t sub;
-    sub = vabdq_u32(b, c);
-    return vaddq_u32( a, sub);
-}
-
-//************** Absolute difference and accumulate - long ********************************
-//*************************************************************************************
-int16x8_t vabal_s8(int16x8_t a, int8x8_t b, int8x8_t c); // VABAL.S8 q0,d0,d0
-_NEON2SSE_INLINE int16x8_t vabal_s8(int16x8_t a, int8x8_t b, int8x8_t c) // VABAL.S8 q0,d0,d0
-{
-    __m128i b16, c16, res;
-    b16 = _MM_CVTEPI8_EPI16 (_pM128i(b)); //SSE4.1,
-    c16 = _MM_CVTEPI8_EPI16 (_pM128i(c)); //SSE4.1,
-    res = _mm_abs_epi16 (_mm_sub_epi16 (b16, c16) );
-    return _mm_add_epi16 (a, res);
-}
-
-int32x4_t vabal_s16(int32x4_t a, int16x4_t b, int16x4_t c); // VABAL.S16 q0,d0,d0
-_NEON2SSE_INLINE int32x4_t vabal_s16(int32x4_t a, int16x4_t b, int16x4_t c) // VABAL.S16 q0,d0,d0
-{
-    __m128i b32, c32, res;
-    b32 = _MM_CVTEPI16_EPI32(_pM128i(b)); //SSE4.1
-    c32 = _MM_CVTEPI16_EPI32(_pM128i(c)); //SSE4.1
-    res = _mm_abs_epi32 (_mm_sub_epi32 (b32, c32) );
-    return _mm_add_epi32 (a, res);
-}
-
-int64x2_t vabal_s32(int64x2_t a, int32x2_t b, int32x2_t c); // VABAL.S32 q0,d0,d0
-_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING (int64x2_t vabal_s32(int64x2_t a, int32x2_t b, int32x2_t c), _NEON2SSE_REASON_SLOW_SERIAL)
-{
-    __m128i res;
-    res = vabdl_s32(b,c);
-    return _mm_add_epi64(a, res);
-}
-
-uint16x8_t vabal_u8(uint16x8_t a, uint8x8_t b, uint8x8_t c); // VABAL.U8 q0,d0,d0
-_NEON2SSE_INLINE uint16x8_t vabal_u8(uint16x8_t a, uint8x8_t b, uint8x8_t c)
-{
-    __m128i b16, c16, res;
-    b16 = _MM_CVTEPU8_EPI16 (_pM128i(b)); //SSE4.1,
-    c16 = _MM_CVTEPU8_EPI16 (_pM128i(c)); //SSE4.1,
-    res = _mm_abs_epi16 (_mm_sub_epi16 (b16, c16) );
-    return _mm_add_epi16 (a, res);
-}
-
-uint32x4_t vabal_u16(uint32x4_t a, uint16x4_t b, uint16x4_t c); // VABAL.U16 q0,d0,d0
-_NEON2SSE_INLINE uint32x4_t vabal_u16(uint32x4_t a, uint16x4_t b, uint16x4_t c)
-{
-    __m128i b32, c32, res;
-    b32 = _MM_CVTEPU16_EPI32(_pM128i(b)); //SSE4.1
-    c32 = _MM_CVTEPU16_EPI32(_pM128i(c)); //SSE4.1
-    res = _mm_abs_epi32 (_mm_sub_epi32 (b32, c32) );
-    return _mm_add_epi32 (a, res);
-}
-
-uint64x2_t vabal_u32(uint64x2_t a, uint32x2_t b, uint32x2_t c); // VABAL.U32 q0,d0,d0
-_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING (uint64x2_t vabal_u32(uint64x2_t a, uint32x2_t b, uint32x2_t c), _NEON2SSE_REASON_SLOW_SERIAL)
-{
-    __m128i res;
-    res = vabdl_u32(b,c);
-    return _mm_add_epi64(a, res);
-}
-
-//***********************************************************************************
-//**************** Maximum and minimum operations **********************************
-//***********************************************************************************
-//************* Maximum: vmax -> Vr[i] := (Va[i] >= Vb[i]) ? Va[i] : Vb[i] *******
-//***********************************************************************************
-int8x8_t vmax_s8(int8x8_t a, int8x8_t b); // VMAX.S8 d0,d0,d0
-_NEON2SSE_INLINE int8x8_t vmax_s8(int8x8_t a, int8x8_t b)
-{
-    int8x8_t res64;
-    __m128i res;
-    res = _MM_MAX_EPI8(_pM128i(a),_pM128i(b)); //SSE4.1, use only lower 64 bits
-    return64(res);
-}
-
-int16x4_t vmax_s16(int16x4_t a, int16x4_t b); // VMAX.S16 d0,d0,d0
-_NEON2SSE_INLINE int16x4_t vmax_s16(int16x4_t a, int16x4_t b)
-{
-    int16x4_t res64;
-    return64(_mm_max_epi16(_pM128i(a),_pM128i(b)));
-}
-
-int32x2_t vmax_s32(int32x2_t a, int32x2_t b); // VMAX.S32 d0,d0,d0
-_NEON2SSE_INLINE int32x2_t vmax_s32(int32x2_t a, int32x2_t b)
-{
-    int32x2_t res64;
-    __m128i res;
-    res = _MM_MAX_EPI32(_pM128i(a),_pM128i(b)); //SSE4.1, use only lower 64 bits
-    return64(res);
-}
-
-uint8x8_t vmax_u8(uint8x8_t a, uint8x8_t b); // VMAX.U8 d0,d0,d0
-_NEON2SSE_INLINE uint8x8_t vmax_u8(uint8x8_t a, uint8x8_t b)
-{
-    uint8x8_t res64;
-    return64(_mm_max_epu8(_pM128i(a),_pM128i(b)));
-}
-
-
-uint16x4_t vmax_u16(uint16x4_t a, uint16x4_t b); // VMAX.U16 d0,d0,d0
-_NEON2SSE_INLINE uint16x4_t
vmax_u16(uint16x4_t a, uint16x4_t b)
-{
-    uint16x4_t res64;
-    return64(_MM_MAX_EPU16(_pM128i(a),_pM128i(b)));
-}
-
-
-uint32x2_t vmax_u32(uint32x2_t a, uint32x2_t b); // VMAX.U32 d0,d0,d0
-_NEON2SSE_INLINE uint32x2_t vmax_u32(uint32x2_t a, uint32x2_t b)
-{
-    uint32x2_t res64;
-    __m128i res;
-    res = _MM_MAX_EPU32(_pM128i(a),_pM128i(b)); //SSE4.1, use only lower 64 bits, may not be effective compared with serial
-    return64(res);
-}
-
-float32x2_t vmax_f32(float32x2_t a, float32x2_t b); // VMAX.F32 d0,d0,d0
-_NEON2SSE_INLINE float32x2_t vmax_f32(float32x2_t a, float32x2_t b)
-{
-    //serial solution looks faster than SIMD one
-    float32x2_t res;
-    res.m64_f32[0] = (a.m64_f32[0] > b.m64_f32[0]) ? a.m64_f32[0] : b.m64_f32[0];
-    res.m64_f32[1] = (a.m64_f32[1] > b.m64_f32[1]) ? a.m64_f32[1] : b.m64_f32[1];
-    return res;
-}
-
-int8x16_t vmaxq_s8(int8x16_t a, int8x16_t b); // VMAX.S8 q0,q0,q0
-#define vmaxq_s8 _MM_MAX_EPI8 //SSE4.1
-
-int16x8_t vmaxq_s16(int16x8_t a, int16x8_t b); // VMAX.S16 q0,q0,q0
-#define vmaxq_s16 _mm_max_epi16
-
-int32x4_t vmaxq_s32(int32x4_t a, int32x4_t b); // VMAX.S32 q0,q0,q0
-#define vmaxq_s32 _MM_MAX_EPI32 //SSE4.1
-
-uint8x16_t vmaxq_u8(uint8x16_t a, uint8x16_t b); // VMAX.U8 q0,q0,q0
-#define vmaxq_u8 _mm_max_epu8
-
-uint16x8_t vmaxq_u16(uint16x8_t a, uint16x8_t b); // VMAX.U16 q0,q0,q0
-#define vmaxq_u16 _MM_MAX_EPU16 //SSE4.1
-
-uint32x4_t vmaxq_u32(uint32x4_t a, uint32x4_t b); // VMAX.U32 q0,q0,q0
-#define vmaxq_u32 _MM_MAX_EPU32 //SSE4.1
-
-
-float32x4_t vmaxq_f32(float32x4_t a, float32x4_t b); // VMAX.F32 q0,q0,q0
-#define vmaxq_f32 _mm_max_ps
-
-//*************** Minimum: vmin -> Vr[i] := (Va[i] >= Vb[i]) ?
Vb[i] : Va[i] ********************************
-//***********************************************************************************************************
-int8x8_t vmin_s8(int8x8_t a, int8x8_t b); // VMIN.S8 d0,d0,d0
-_NEON2SSE_INLINE int8x8_t vmin_s8(int8x8_t a, int8x8_t b)
-{
-    int8x8_t res64;
-    __m128i res;
-    res = _MM_MIN_EPI8(_pM128i(a),_pM128i(b)); //SSE4.1, use only lower 64 bits
-    return64(res);
-}
-
-int16x4_t vmin_s16(int16x4_t a, int16x4_t b); // VMIN.S16 d0,d0,d0
-_NEON2SSE_INLINE int16x4_t vmin_s16(int16x4_t a, int16x4_t b)
-{
-    int16x4_t res64;
-    return64(_mm_min_epi16(_pM128i(a),_pM128i(b)));
-}
-
-
-int32x2_t vmin_s32(int32x2_t a, int32x2_t b); // VMIN.S32 d0,d0,d0
-_NEON2SSE_INLINE int32x2_t vmin_s32(int32x2_t a, int32x2_t b)
-{
-    int32x2_t res64;
-    __m128i res;
-    res = _MM_MIN_EPI32(_pM128i(a),_pM128i(b)); //SSE4.1, use only lower 64 bits
-    return64(res);
-}
-
-uint8x8_t vmin_u8(uint8x8_t a, uint8x8_t b); // VMIN.U8 d0,d0,d0
-_NEON2SSE_INLINE uint8x8_t vmin_u8(uint8x8_t a, uint8x8_t b)
-{
-    uint8x8_t res64;
-    return64(_mm_min_epu8(_pM128i(a),_pM128i(b)));
-}
-
-
-uint16x4_t vmin_u16(uint16x4_t a, uint16x4_t b); // VMIN.U16 d0,d0,d0
-_NEON2SSE_INLINE uint16x4_t vmin_u16(uint16x4_t a, uint16x4_t b)
-{
-    uint16x4_t res64;
-    return64(_MM_MIN_EPU16(_pM128i(a),_pM128i(b)));
-}
-
-
-uint32x2_t vmin_u32(uint32x2_t a, uint32x2_t b); // VMIN.U32 d0,d0,d0
-_NEON2SSE_INLINE uint32x2_t vmin_u32(uint32x2_t a, uint32x2_t b)
-{
-    uint32x2_t res64;
-    __m128i res;
-    res = _MM_MIN_EPU32(_pM128i(a),_pM128i(b)); //SSE4.1, use only lower 64 bits, may not be effective compared with serial
-    return64(res);
-}
-
-float32x2_t vmin_f32(float32x2_t a, float32x2_t b); // VMIN.F32 d0,d0,d0
-_NEON2SSE_INLINE float32x2_t vmin_f32(float32x2_t a, float32x2_t b)
-{
-    //serial solution looks faster than SIMD one
-    float32x2_t res;
-    res.m64_f32[0] = (a.m64_f32[0] < b.m64_f32[0]) ? a.m64_f32[0] : b.m64_f32[0];
-    res.m64_f32[1] = (a.m64_f32[1] < b.m64_f32[1]) ?
a.m64_f32[1] : b.m64_f32[1];
-    return res;
-}
-
-int8x16_t vminq_s8(int8x16_t a, int8x16_t b); // VMIN.S8 q0,q0,q0
-#define vminq_s8 _MM_MIN_EPI8 //SSE4.1
-
-int16x8_t vminq_s16(int16x8_t a, int16x8_t b); // VMIN.S16 q0,q0,q0
-#define vminq_s16 _mm_min_epi16
-
-int32x4_t vminq_s32(int32x4_t a, int32x4_t b); // VMIN.S32 q0,q0,q0
-#define vminq_s32 _MM_MIN_EPI32 //SSE4.1
-
-uint8x16_t vminq_u8(uint8x16_t a, uint8x16_t b); // VMIN.U8 q0,q0,q0
-#define vminq_u8 _mm_min_epu8
-
-uint16x8_t vminq_u16(uint16x8_t a, uint16x8_t b); // VMIN.U16 q0,q0,q0
-#define vminq_u16 _MM_MIN_EPU16 //SSE4.1
-
-uint32x4_t vminq_u32(uint32x4_t a, uint32x4_t b); // VMIN.U32 q0,q0,q0
-#define vminq_u32 _MM_MIN_EPU32 //SSE4.1
-
-float32x4_t vminq_f32(float32x4_t a, float32x4_t b); // VMIN.F32 q0,q0,q0
-#define vminq_f32 _mm_min_ps
-
-//************* Pairwise addition operations. **************************************
-//************************************************************************************
-//Pairwise add - adds adjacent pairs of elements of two vectors, and places the results in the destination vector
-int8x8_t vpadd_s8(int8x8_t a, int8x8_t b); // VPADD.I8 d0,d0,d0
-_NEON2SSE_INLINE int8x8_t vpadd_s8(int8x8_t a, int8x8_t b) // VPADD.I8 d0,d0,d0
-{
-    //no 8 bit hadd in IA32, need to go to 16 bit and then pack
-    int8x8_t res64;
-    __m128i a16, b16, res;
-    _NEON2SSE_ALIGN_16 int8_t mask8_16_even_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15 };
-    a16 = _MM_CVTEPI8_EPI16 (_pM128i(a)); // SSE 4.1
-    b16 = _MM_CVTEPI8_EPI16 (_pM128i(b)); // SSE 4.1
-    res = _mm_hadd_epi16 (a16, b16);
-    res = _mm_shuffle_epi8 (res, *(__m128i*) mask8_16_even_odd); //return to 8 bit, use low 64 bits
-    return64(res);
-}
-
-int16x4_t vpadd_s16(int16x4_t a, int16x4_t b); // VPADD.I16 d0,d0,d0
-_NEON2SSE_INLINE int16x4_t vpadd_s16(int16x4_t a, int16x4_t b)
-{
-    int16x4_t res64;
-    __m128i hadd128;
-    hadd128 = _mm_hadd_epi16 (_pM128i(a), _pM128i(b));
-    hadd128 = _mm_shuffle_epi32 (hadd128, 0 |
(2 << 2) | (1 << 4) | (3 << 6));
-    return64(hadd128);
-}
-
-
-int32x2_t vpadd_s32(int32x2_t a, int32x2_t b); // VPADD.I32 d0,d0,d0
-_NEON2SSE_INLINE int32x2_t vpadd_s32(int32x2_t a, int32x2_t b)
-{
-    int32x2_t res64;
-    __m128i hadd128;
-    hadd128 = _mm_hadd_epi32 (_pM128i(a), _pM128i(b));
-    hadd128 = _mm_shuffle_epi32 (hadd128, 0 | (2 << 2) | (1 << 4) | (3 << 6));
-    return64(hadd128);
-}
-
-
-uint8x8_t vpadd_u8(uint8x8_t a, uint8x8_t b); // VPADD.I8 d0,d0,d0
-_NEON2SSE_INLINE uint8x8_t vpadd_u8(uint8x8_t a, uint8x8_t b) // VPADD.I8 d0,d0,d0
-{
-    // no 8 bit hadd in IA32, need to go to 16 bit and then pack
-    uint8x8_t res64;
-    // no unsigned _mm_hadd_ functions in IA32, but 8-bit unsigned fits into 16-bit signed, so it should work
-    __m128i mask8, a16, b16, res;
-    mask8 = _mm_set1_epi16(0xff);
-    a16 = _MM_CVTEPU8_EPI16 (_pM128i(a)); // SSE 4.1
-    b16 = _MM_CVTEPU8_EPI16 (_pM128i(b)); // SSE 4.1
-    res = _mm_hadd_epi16 (a16, b16);
-    res = _mm_and_si128(res, mask8); //to avoid saturation
-    res = _mm_packus_epi16 (res,res); //use low 64 bits
-    return64(res);
-}
-
-uint16x4_t vpadd_u16(uint16x4_t a, uint16x4_t b); // VPADD.I16 d0,d0,d0
-_NEON2SSE_INLINE uint16x4_t vpadd_u16(uint16x4_t a, uint16x4_t b) // VPADD.I16 d0,d0,d0
-{
-    // solution may not be optimal, serial execution may be faster
-    // no unsigned _mm_hadd_ functions in IA32, need to move from unsigned to signed
-    uint16x4_t res64;
-    __m128i c32767, cfffe, as, bs, res;
-    c32767 = _mm_set1_epi16 (32767);
-    cfffe = _mm_set1_epi16 (0xfffe);
-    as = _mm_sub_epi16 (_pM128i(a), c32767);
-    bs = _mm_sub_epi16 (_pM128i(b), c32767);
-    res = _mm_hadd_epi16 (as, bs);
-    res = _mm_add_epi16 (res, cfffe);
-    res = _mm_shuffle_epi32 (res, 0 | (2 << 2) | (1 << 4) | (3 << 6));
-    return64(res);
-}
-
-uint32x2_t vpadd_u32(uint32x2_t a, uint32x2_t b); // VPADD.I32 d0,d0,d0
-_NEON2SSE_INLINE uint32x2_t vpadd_u32(uint32x2_t a, uint32x2_t b) //serial may be faster
-{
-    //hadd doesn't work for unsigned values
-    uint32x2_t res64;
-    __m128i ab,
ab_sh, res;
-    ab = _mm_unpacklo_epi64 ( _pM128i(a), _pM128i(b)); //a0 a1 b0 b1
-    ab_sh = _mm_shuffle_epi32(ab, 1 | (0 << 2) | (3 << 4) | (2 << 6)); //a1, a0, b1, b0
-    res = _mm_add_epi32(ab, ab_sh);
-    res = _mm_shuffle_epi32(res, 0 | (2 << 2) | (1 << 4) | (3 << 6));
-    return64(res);
-}
-
-float32x2_t vpadd_f32(float32x2_t a, float32x2_t b); // VPADD.F32 d0,d0,d0
-_NEON2SSE_INLINE float32x2_t vpadd_f32(float32x2_t a, float32x2_t b)
-{
-    __m128 hadd128;
-    __m64_128 res64;
-    hadd128 = _mm_hadd_ps (_pM128(a), _pM128(b));
-    hadd128 = _mm_shuffle_ps (hadd128, hadd128, _MM_SHUFFLE(3,1, 2, 0)); //use low 64 bits
-    _M64f(res64, hadd128);
-    return res64;
-}
-
-
-//************************** Long pairwise add **********************************
-//*********************************************************************************
-//Adds adjacent pairs of elements of a vector, sign or zero extends the results to twice their original width,
-// and places the final results in the destination vector.
-
-int16x4_t vpaddl_s8(int8x8_t a); // VPADDL.S8 d0,d0
-_NEON2SSE_INLINE int16x4_t vpaddl_s8(int8x8_t a) // VPADDL.S8 d0,d0
-{
-    //no 8 bit hadd in IA32, need to go to 16 bit anyway
-    __m128i a16;
-    int16x4_t res64;
-    a16 = _MM_CVTEPI8_EPI16 (_pM128i(a)); // SSE 4.1
-    a16 = _mm_hadd_epi16 (a16, a16); //use low 64 bits
-    return64(a16);
-}
-
-int32x2_t vpaddl_s16(int16x4_t a); // VPADDL.S16 d0,d0
-_NEON2SSE_INLINE int32x2_t vpaddl_s16(int16x4_t a) // VPADDL.S16 d0,d0
-{
-    // solution may not be optimal, serial execution may be faster
-    int32x2_t res64;
-    __m128i r32_1;
-    r32_1 = _MM_CVTEPI16_EPI32 (_pM128i(a));
-    r32_1 = _mm_hadd_epi32(r32_1, r32_1); //use low 64 bits
-    return64(r32_1);
-}
-
-int64x1_t vpaddl_s32(int32x2_t a); // VPADDL.S32 d0,d0
-_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x1_t vpaddl_s32(int32x2_t a), _NEON2SSE_REASON_SLOW_SERIAL) //serial solution looks faster
-{
-    int64x1_t res;
-    res.m64_i64[0] = (int64_t)a.m64_i32[0] + (int64_t)a.m64_i32[1];
-    return res;
-}
-
-uint16x4_t vpaddl_u8(uint8x8_t a); // VPADDL.U8 d0,d0
-_NEON2SSE_INLINE uint16x4_t vpaddl_u8(uint8x8_t a) // VPADDL.U8 d0,d0
-{
-    // no 8 bit hadd in IA32, need to go to 16 bit
-    // no unsigned _mm_hadd_ functions in IA32, but 8-bit unsigned fits into 16-bit signed, so it should work
-    uint16x4_t res64;
-    __m128i a16;
-    a16 = _MM_CVTEPU8_EPI16 (_pM128i(a)); // SSE 4.1 use low 64 bits
-    a16 = _mm_hadd_epi16 (a16, a16); //use low 64 bits
-    return64(a16);
-}
-
-uint32x2_t vpaddl_u16(uint16x4_t a); // VPADDL.U16 d0,d0
-_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x2_t vpaddl_u16(uint16x4_t a), _NEON2SSE_REASON_SLOW_SERIAL)
-{
-    //serial solution looks faster than a SIMD one
-    uint32x2_t res;
-    res.m64_u32[0] = (uint32_t)a.m64_u16[0] + (uint32_t)a.m64_u16[1];
-    res.m64_u32[1] = (uint32_t)a.m64_u16[2] + (uint32_t)a.m64_u16[3];
-    return res;
-}
-
-uint64x1_t vpaddl_u32(uint32x2_t a); // VPADDL.U32 d0,d0
-_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x1_t vpaddl_u32(uint32x2_t a), _NEON2SSE_REASON_SLOW_SERIAL) //serial solution looks faster
-{
-    uint64x1_t res;
-    res.m64_u64[0] = (uint64_t)a.m64_u32[0] + (uint64_t)a.m64_u32[1];
-    return res;
-}
-
-int16x8_t vpaddlq_s8(int8x16_t a); // VPADDL.S8 q0,q0
-_NEON2SSE_INLINE int16x8_t vpaddlq_s8(int8x16_t a) // VPADDL.S8 q0,q0
-{
-    //no 8 bit hadd in IA32, need to go to 16 bit
-    __m128i r16_1, r16_2;
-    r16_1 = _MM_CVTEPI8_EPI16 (a); // SSE 4.1
-    //swap hi and low part of r to process the remaining data
-    r16_2 = _mm_shuffle_epi32 (a, _SWAP_HI_LOW32);
-    r16_2 = _MM_CVTEPI8_EPI16 (r16_2);
-    return _mm_hadd_epi16 (r16_1, r16_2);
-}
-
-int32x4_t vpaddlq_s16(int16x8_t a); // VPADDL.S16 q0,q0
-_NEON2SSE_INLINE int32x4_t vpaddlq_s16(int16x8_t a) // VPADDL.S16 q0,q0
-{
-    //widen to 32 bit first since the result needs twice the original width
-    __m128i r32_1, r32_2;
-    r32_1 = _MM_CVTEPI16_EPI32(a);
-    //swap hi and low part of r to process the remaining data
-    r32_2 = _mm_shuffle_epi32 (a, _SWAP_HI_LOW32);
-    r32_2 = _MM_CVTEPI16_EPI32 (r32_2);
-
return _mm_hadd_epi32 (r32_1, r32_2);
-}
-
-int64x2_t vpaddlq_s32(int32x4_t a); // VPADDL.S32 q0,q0
-_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x2_t vpaddlq_s32(int32x4_t a), _NEON2SSE_REASON_SLOW_SERIAL) // VPADDL.S32 q0,q0
-{
-    _NEON2SSE_ALIGN_16 int32_t atmp[4];
-    _NEON2SSE_ALIGN_16 int64_t res[2];
-    _mm_store_si128((__m128i*)atmp, a);
-    res[0] = (int64_t)atmp[0] + (int64_t)atmp[1];
-    res[1] = (int64_t)atmp[2] + (int64_t)atmp[3];
-    return _mm_load_si128((__m128i*)res);
-}
-
-uint16x8_t vpaddlq_u8(uint8x16_t a); // VPADDL.U8 q0,q0
-_NEON2SSE_INLINE uint16x8_t vpaddlq_u8(uint8x16_t a) // VPADDL.U8 q0,q0
-{
-    //no 8 bit hadd in IA32, need to go to 16 bit
-    __m128i r16_1, r16_2;
-    r16_1 = _MM_CVTEPU8_EPI16(a);
-    //swap hi and low part of r to process the remaining data
-    r16_2 = _mm_shuffle_epi32 (a, _SWAP_HI_LOW32);
-    r16_2 = _MM_CVTEPU8_EPI16 (r16_2);
-    return _mm_hadd_epi16 (r16_1, r16_2);
-}
-
-uint32x4_t vpaddlq_u16(uint16x8_t a); // VPADDL.U16 q0,q0
-_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x4_t vpaddlq_u16(uint16x8_t a), _NEON2SSE_REASON_SLOW_SERIAL)
-{
-    //serial solution looks faster than a SIMD one
-    _NEON2SSE_ALIGN_16 uint16_t atmp[8];
-    _NEON2SSE_ALIGN_16 uint32_t res[4];
-    _mm_store_si128((__m128i*)atmp, a);
-    res[0] = (uint32_t)atmp[0] + (uint32_t)atmp[1];
-    res[1] = (uint32_t)atmp[2] + (uint32_t)atmp[3];
-    res[2] = (uint32_t)atmp[4] + (uint32_t)atmp[5];
-    res[3] = (uint32_t)atmp[6] + (uint32_t)atmp[7];
-    return _mm_load_si128((__m128i*)res);
-}
-
-uint64x2_t vpaddlq_u32(uint32x4_t a); // VPADDL.U32 q0,q0
-_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x2_t vpaddlq_u32(uint32x4_t a), _NEON2SSE_REASON_SLOW_SERIAL)
-{
-    _NEON2SSE_ALIGN_16 uint32_t atmp[4];
-    _NEON2SSE_ALIGN_16 uint64_t res[2];
-    _mm_store_si128((__m128i*)atmp, a);
-    res[0] = (uint64_t)atmp[0] + (uint64_t)atmp[1];
-    res[1] = (uint64_t)atmp[2] + (uint64_t)atmp[3];
-    return _mm_load_si128((__m128i*)res);
-}
-
-//************************ Long pairwise add and
accumulate **************************
-//****************************************************************************************
-//VPADAL (Vector Pairwise Add and Accumulate Long) adds adjacent pairs of elements of a vector,
-// and accumulates the values of the results into the elements of the destination (wide) vector
-int16x4_t vpadal_s8(int16x4_t a, int8x8_t b); // VPADAL.S8 d0,d0
-_NEON2SSE_INLINE int16x4_t vpadal_s8(int16x4_t a, int8x8_t b)
-{
-    int16x4_t res64;
-    return64(vpadalq_s8(_pM128i(a), _pM128i(b)));
-}
-
-int32x2_t vpadal_s16(int32x2_t a, int16x4_t b); // VPADAL.S16 d0,d0
-_NEON2SSE_INLINE int32x2_t vpadal_s16(int32x2_t a, int16x4_t b)
-{
-    int32x2_t res64;
-    return64(vpadalq_s16(_pM128i(a), _pM128i(b)));
-}
-
-
-int64x1_t vpadal_s32(int64x1_t a, int32x2_t b); // VPADAL.S32 d0,d0
-_NEON2SSE_INLINE int64x1_t vpadal_s32(int64x1_t a, int32x2_t b)
-{
-    int64x1_t res;
-    res.m64_i64[0] = (int64_t)b.m64_i32[0] + (int64_t)b.m64_i32[1] + a.m64_i64[0];
-    return res;
-}
-
-uint16x4_t vpadal_u8(uint16x4_t a, uint8x8_t b); // VPADAL.U8 d0,d0
-_NEON2SSE_INLINE uint16x4_t vpadal_u8(uint16x4_t a, uint8x8_t b)
-{
-    uint16x4_t res64;
-    return64(vpadalq_u8(_pM128i(a), _pM128i(b)));
-}
-
-
-uint32x2_t vpadal_u16(uint32x2_t a, uint16x4_t b); // VPADAL.U16 d0,d0
-_NEON2SSE_INLINE uint32x2_t vpadal_u16(uint32x2_t a, uint16x4_t b)
-{
-    uint32x2_t res64;
-    return64(vpadalq_u16(_pM128i(a), _pM128i(b)));
-}
-
-uint64x1_t vpadal_u32(uint64x1_t a, uint32x2_t b); // VPADAL.U32 d0,d0
-_NEON2SSE_INLINE uint64x1_t vpadal_u32(uint64x1_t a, uint32x2_t b)
-{
-    uint64x1_t res;
-    res.m64_u64[0] = (uint64_t)b.m64_u32[0] + (uint64_t)b.m64_u32[1] + a.m64_u64[0];
-    return res;
-}
-
-int16x8_t vpadalq_s8(int16x8_t a, int8x16_t b); // VPADAL.S8 q0,q0
-_NEON2SSE_INLINE int16x8_t vpadalq_s8(int16x8_t a, int8x16_t b) // VPADAL.S8 q0,q0
-{
-    int16x8_t pad;
-    pad = vpaddlq_s8(b);
-    return _mm_add_epi16 (a, pad);
-}
-
-int32x4_t vpadalq_s16(int32x4_t a, int16x8_t b); // VPADAL.S16 q0,q0
-_NEON2SSE_INLINE int32x4_t vpadalq_s16(int32x4_t a, int16x8_t b) // VPADAL.S16 q0,q0 -{ - int32x4_t pad; - pad = vpaddlq_s16(b); - return _mm_add_epi32(a, pad); -} - -int64x2_t vpadalq_s32(int64x2_t a, int32x4_t b); // VPADAL.S32 q0,q0 -_NEON2SSE_INLINE int64x2_t vpadalq_s32(int64x2_t a, int32x4_t b) -{ - int64x2_t pad; - pad = vpaddlq_s32(b); - return _mm_add_epi64 (a, pad); -} - -uint16x8_t vpadalq_u8(uint16x8_t a, uint8x16_t b); // VPADAL.U8 q0,q0 -_NEON2SSE_INLINE uint16x8_t vpadalq_u8(uint16x8_t a, uint8x16_t b) // VPADAL.U8 q0,q0 -{ - uint16x8_t pad; - pad = vpaddlq_u8(b); - return _mm_add_epi16 (a, pad); -} - -uint32x4_t vpadalq_u16(uint32x4_t a, uint16x8_t b); // VPADAL.s16 q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x4_t vpadalq_u16(uint32x4_t a, uint16x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - uint32x4_t pad; - pad = vpaddlq_u16(b); - return _mm_add_epi32(a, pad); -} //no optimal SIMD solution, serial is faster - -uint64x2_t vpadalq_u32(uint64x2_t a, uint32x4_t b); // VPADAL.U32 q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x2_t vpadalq_u32(uint64x2_t a, uint32x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - //no optimal SIMD solution, serial is faster - uint64x2_t pad; - pad = vpaddlq_u32(b); - return _mm_add_epi64(a, pad); -} //no optimal SIMD solution, serial is faster - -//********** Folding maximum ************************************* -//******************************************************************* -//VPMAX (Vector Pairwise Maximum) compares adjacent pairs of elements in two vectors, -//and copies the larger of each pair into the corresponding element in the destination -// no corresponding functionality in IA32 SIMD, so we need to do the vertical comparison -int8x8_t vpmax_s8(int8x8_t a, int8x8_t b); // VPMAX.S8 d0,d0,d0 -_NEON2SSE_INLINE int8x8_t vpmax_s8(int8x8_t a, int8x8_t b) // VPMAX.S8 d0,d0,d0 -{ - int8x8_t res64; - __m128i ab, ab1, max; - _NEON2SSE_ALIGN_16 uint8_t mask8_sab[16] = { 1, 0, 3, 2, 5, 4, 7, 6, 
9, 8, 11, 10, 13, 12, 15, 14}; - _NEON2SSE_ALIGN_16 uint8_t mask8_odd[16] = { 1, 3, 5, 7, 9, 11, 13, 15, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff}; - ab = _mm_unpacklo_epi64 ( _pM128i(a), _pM128i(b)); //ab - ab1 = _mm_shuffle_epi8 (ab, *(__m128i*) mask8_sab); //horizontal pairs swap for vertical max finding - max = _MM_MAX_EPI8 (ab, ab1); // SSE4.1 - max = _mm_shuffle_epi8 (max, *(__m128i*) mask8_odd); //remove repetitive data - return64(max); //we need 64 bits only -} - -int16x4_t vpmax_s16(int16x4_t a, int16x4_t b); // VPMAX.S16 d0,d0,d0 -_NEON2SSE_INLINE int16x4_t vpmax_s16(int16x4_t a, int16x4_t b) // VPMAX.S16 d0,d0,d0 -{ - //this solution may not be optimal compared with the serial one - int16x4_t res64; - __m128i ab, ab1, max; - _NEON2SSE_ALIGN_16 int8_t mask16_sab[16] = { 2, 3, 0, 1, 6, 7, 4, 5, 10, 11, 8, 9, 14, 15, 12, 13}; //each pair of chars is considered to be a 16-bit number - _NEON2SSE_ALIGN_16 int8_t mask16_odd[16] = { 0,1, 4,5, 8,9, 12,13, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff}; - ab = _mm_unpacklo_epi64 ( _pM128i(a), _pM128i(b)); //ab - ab1 = _mm_shuffle_epi8 (ab, *(__m128i*) mask16_sab); //horizontal pairs swap for vertical max finding, use 8bit fn and the corresponding mask - max = _mm_max_epi16 (ab, ab1); - max = _mm_shuffle_epi8 (max, *(__m128i*) mask16_odd); //remove repetitive data, use 8bit fn and the corresponding mask - return64(max); -} - -int32x2_t vpmax_s32(int32x2_t a, int32x2_t b); // VPMAX.S32 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x2_t vpmax_s32(int32x2_t a, int32x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - //the serial solution looks faster than a SIMD one - int32x2_t res; - res.m64_i32[0] = (a.m64_i32[0] < a.m64_i32[1]) ? a.m64_i32[1] : a.m64_i32[0]; - res.m64_i32[1] = (b.m64_i32[0] < b.m64_i32[1]) ? 
b.m64_i32[1] : b.m64_i32[0]; - return res; -} - -uint8x8_t vpmax_u8(uint8x8_t a, uint8x8_t b); // VPMAX.U8 d0,d0,d0 -_NEON2SSE_INLINE uint8x8_t vpmax_u8(uint8x8_t a, uint8x8_t b) // VPMAX.U8 d0,d0,d0 -{ - uint8x8_t res64; - __m128i ab, ab1, max; - _NEON2SSE_ALIGN_16 int8_t mask8_sab[16] = { 1, 0, 3, 2, 5, 4, 7, 6, 9, 8, 11, 10, 13, 12, 15, 14}; - _NEON2SSE_ALIGN_16 uint8_t mask8_odd[16] = { 1, 3, 5, 7, 9, 11, 13, 15, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff}; - ab = _mm_unpacklo_epi64 (_pM128i(a), _pM128i(b)); //ab - ab1 = _mm_shuffle_epi8 (ab, *(__m128i*) mask8_sab); //horizontal pairs swap for vertical max finding - max = _mm_max_epu8 (ab, ab1); // SSE4.1 - max = _mm_shuffle_epi8 (max, *(__m128i*) mask8_odd); //remove repetitive data - return64(max); -} - -uint16x4_t vpmax_u16(uint16x4_t a, uint16x4_t b); // VPMAX.s16 d0,d0,d0 -_NEON2SSE_INLINE uint16x4_t vpmax_u16(uint16x4_t a, uint16x4_t b) // VPMAX.s16 d0,d0,d0 -{ - //this solution may not be optimal compared with the serial one - uint16x4_t res64; - __m128i ab, ab1, max; - _NEON2SSE_ALIGN_16 uint8_t mask16_sab[16] = { 2, 3, 0, 1, 6, 7, 4, 5, 10, 11, 8, 9, 14, 15, 12, 13}; //each pair of chars is considered to be a 16-bit number - _NEON2SSE_ALIGN_16 uint8_t mask16_odd[16] = { 0,1, 4,5, 8,9, 12,13, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff}; - ab = _mm_unpacklo_epi64 ( _pM128i(a), _pM128i(b)); //ab - ab1 = _mm_shuffle_epi8 (ab, *(__m128i*) mask16_sab); //horizontal pairs swap for vertical max finding, use 8bit fn and the corresponding mask - max = _MM_MAX_EPU16 (ab, ab1); - max = _mm_shuffle_epi8 (max, *(__m128i*) mask16_odd); //remove repetitive data, use 8bit fn and the corresponding mask - return64(max); -} - -uint32x2_t vpmax_u32(uint32x2_t a, uint32x2_t b); // VPMAX.U32 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x2_t vpmax_u32(uint32x2_t a, uint32x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - //the serial solution looks faster than a SIMD one; use the unsigned members to get an unsigned compare - uint32x2_t res; - res.m64_u32[0] = (a.m64_u32[0] < 
a.m64_u32[1]) ? a.m64_u32[1] : a.m64_u32[0]; - res.m64_u32[1] = (b.m64_u32[0] < b.m64_u32[1]) ? b.m64_u32[1] : b.m64_u32[0]; - return res; -} - -float32x2_t vpmax_f32(float32x2_t a, float32x2_t b); // VPMAX.F32 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(float32x2_t vpmax_f32(float32x2_t a, float32x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - //the serial solution looks faster than a SIMD one - float32x2_t res; - res.m64_f32[0] = (a.m64_f32[0] < a.m64_f32[1]) ? a.m64_f32[1] : a.m64_f32[0]; - res.m64_f32[1] = (b.m64_f32[0] < b.m64_f32[1]) ? b.m64_f32[1] : b.m64_f32[0]; - return res; -} - -// ***************** Folding minimum **************************** -// ************************************************************** -//vpmin -> takes minimum of adjacent pairs -int8x8_t vpmin_s8(int8x8_t a, int8x8_t b); // VPMIN.S8 d0,d0,d0 -_NEON2SSE_INLINE int8x8_t vpmin_s8(int8x8_t a, int8x8_t b) // VPMIN.S8 d0,d0,d0 -{ - int8x8_t res64; - __m128i ab, ab1, min; - _NEON2SSE_ALIGN_16 uint8_t mask8_sab[16] = { 1, 0, 3, 2, 5, 4, 7, 6, 9, 8, 11, 10, 13, 12, 15, 14}; - _NEON2SSE_ALIGN_16 uint8_t mask8_odd[16] = { 1, 3, 5, 7, 9, 11, 13, 15, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff}; - ab = _mm_unpacklo_epi64 ( _pM128i(a), _pM128i(b)); //ab - ab1 = _mm_shuffle_epi8 (ab, *(__m128i*) mask8_sab); //horizontal pairs swap for vertical min finding - min = _MM_MIN_EPI8 (ab, ab1); // SSE4.1 - min = _mm_shuffle_epi8 (min, *(__m128i*) mask8_odd); //remove repetitive data - return64(min); -} - -int16x4_t vpmin_s16(int16x4_t a, int16x4_t b); // VPMIN.S16 d0,d0,d0 -_NEON2SSE_INLINE int16x4_t vpmin_s16(int16x4_t a, int16x4_t b) // VPMIN.S16 d0,d0,d0 -{ - //this solution may not be optimal compared with the serial one - int16x4_t res64; - __m128i ab, ab1, min; - _NEON2SSE_ALIGN_16 int8_t mask16_sab[16] = { 2, 3, 0, 1, 6, 7, 4, 5, 10, 11, 8, 9, 14, 15, 12, 13}; //each pair of chars is considered to be a 16-bit number - _NEON2SSE_ALIGN_16 int8_t 
mask16_odd[16] = { 0,1, 4,5, 8,9, 12,13, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff}; - ab = _mm_unpacklo_epi64 ( _pM128i(a), _pM128i(b)); //ab - ab1 = _mm_shuffle_epi8 (ab, *(__m128i*) mask16_sab); //horizontal pairs swap for vertical min finding, use 8bit fn and the corresponding mask - min = _mm_min_epi16 (ab, ab1); - min = _mm_shuffle_epi8 (min, *(__m128i*) mask16_odd); //remove repetitive data, use 8bit fn and the corresponding mask - return64(min); -} - -int32x2_t vpmin_s32(int32x2_t a, int32x2_t b); // VPMIN.S32 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x2_t vpmin_s32(int32x2_t a, int32x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - //the serial solution looks faster than a SIMD one - int32x2_t res; - res.m64_i32[0] = (a.m64_i32[0] > a.m64_i32[1]) ? a.m64_i32[1] : a.m64_i32[0]; - res.m64_i32[1] = (b.m64_i32[0] > b.m64_i32[1]) ? b.m64_i32[1] : b.m64_i32[0]; - return res; -} - -uint8x8_t vpmin_u8(uint8x8_t a, uint8x8_t b); // VPMIN.U8 d0,d0,d0 -_NEON2SSE_INLINE uint8x8_t vpmin_u8(uint8x8_t a, uint8x8_t b) // VPMIN.U8 d0,d0,d0 -{ - uint8x8_t res64; - __m128i ab, ab1, min; - _NEON2SSE_ALIGN_16 uint8_t mask8_sab[16] = { 1, 0, 3, 2, 5, 4, 7, 6, 9, 8, 11, 10, 13, 12, 15, 14}; - _NEON2SSE_ALIGN_16 uint8_t mask8_odd[16] = { 1, 3, 5, 7, 9, 11, 13, 15, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff}; - ab = _mm_unpacklo_epi64 ( _pM128i(a), _pM128i(b)); //ab - ab1 = _mm_shuffle_epi8 (ab, *(__m128i*) mask8_sab); //horizontal pairs swap for vertical min finding - min = _mm_min_epu8 (ab, ab1); // SSE4.1 - min = _mm_shuffle_epi8 (min, *(__m128i*) mask8_odd); //remove repetitive data - return64(min); -} - -uint16x4_t vpmin_u16(uint16x4_t a, uint16x4_t b); // VPMIN.s16 d0,d0,d0 -_NEON2SSE_INLINE uint16x4_t vpmin_u16(uint16x4_t a, uint16x4_t b) // VPMIN.s16 d0,d0,d0 -{ - //this solution may not be optimal compared with the serial one - uint16x4_t res64; - __m128i ab, ab1, min; - _NEON2SSE_ALIGN_16 uint8_t mask16_sab[16] = { 2, 3, 0, 1, 6, 7, 4, 5, 10, 11, 8, 9, 14, 
15, 12, 13}; //each pair of chars is considered to be a 16-bit number - _NEON2SSE_ALIGN_16 uint8_t mask16_odd[16] = { 0,1, 4,5, 8,9, 12,13, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff}; - ab = _mm_unpacklo_epi64 ( _pM128i(a), _pM128i(b)); //ab - ab1 = _mm_shuffle_epi8 (ab, *(__m128i*) mask16_sab); //horizontal pairs swap for vertical min finding, use 8bit fn and the corresponding mask - min = _MM_MIN_EPU16 (ab, ab1); - min = _mm_shuffle_epi8 (min, *(__m128i*) mask16_odd); //remove repetitive data, use 8bit fn and the corresponding mask - return64(min); -} - -uint32x2_t vpmin_u32(uint32x2_t a, uint32x2_t b); // VPMIN.U32 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x2_t vpmin_u32(uint32x2_t a, uint32x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - //the serial solution looks faster than a SIMD one - uint32x2_t res; - res.m64_u32[0] = (a.m64_u32[0] > a.m64_u32[1]) ? a.m64_u32[1] : a.m64_u32[0]; - res.m64_u32[1] = (b.m64_u32[0] > b.m64_u32[1]) ? b.m64_u32[1] : b.m64_u32[0]; - return res; -} - -float32x2_t vpmin_f32(float32x2_t a, float32x2_t b); // VPMIN.F32 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(float32x2_t vpmin_f32(float32x2_t a, float32x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - //the serial solution looks faster than a SIMD one - float32x2_t res; - res.m64_f32[0] = (a.m64_f32[0] > a.m64_f32[1]) ? a.m64_f32[1] : a.m64_f32[0]; - res.m64_f32[1] = (b.m64_f32[0] > b.m64_f32[1]) ? 
b.m64_f32[1] : b.m64_f32[0]; - return res; -} - -//*************************************************************** -//*********** Reciprocal/Sqrt ************************************ -//*************************************************************** -//****************** Reciprocal estimate ******************************* -//the ARM NEON and x86 SIMD results may be slightly different -float32x2_t vrecpe_f32(float32x2_t a); // VRECPE.F32 d0,d0 -_NEON2SSE_INLINE float32x2_t vrecpe_f32(float32x2_t a) //use low 64 bits -{ - float32x4_t res; - __m64_128 res64; - res = _mm_rcp_ps(_pM128(a)); - _M64f(res64, res); - return res64; -} - -uint32x2_t vrecpe_u32(uint32x2_t a); // VRECPE.U32 d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x2_t vrecpe_u32(uint32x2_t a), _NEON2SSE_REASON_SLOW_SERIAL) -{ - //Input is fixed point number!!! No reciprocal for ints in IA32 available - uint32x2_t res; - float resf, r; - int i, q, s; - for (i =0; i<2; i++){ - if((a.m64_u32[i] & 0x80000000) == 0) { - res.m64_u32[i] = 0xffffffff; - }else{ - resf = (float) (a.m64_u32[i] * (0.5f / (uint32_t)(1 << 31))); - q = (int)(resf * 512.0); /* a in units of 1/512 rounded down */ - r = 1.0 / (((float)q + 0.5) / 512.0); /* reciprocal r */ - s = (int)(256.0 * r + 0.5); /* r in units of 1/256 rounded to nearest */ - r = (float)s / 256.0; - res.m64_u32[i] = r * (uint32_t)(1 << 31); - } - } - return res; -} - -float32x4_t vrecpeq_f32(float32x4_t a); // VRECPE.F32 q0,q0 -#define vrecpeq_f32 _mm_rcp_ps - - -uint32x4_t vrecpeq_u32(uint32x4_t a); // VRECPE.U32 q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x4_t vrecpeq_u32(uint32x4_t a), _NEON2SSE_REASON_SLOW_SERIAL) -{ - //Input is fixed point number!!! 
- //We implement the recip_estimate function as described in ARMv7 reference manual (VRECPE instruction) but use float instead of double - _NEON2SSE_ALIGN_16 uint32_t atmp[4]; - _NEON2SSE_ALIGN_16 uint32_t res[4]; - _NEON2SSE_ALIGN_16 int c80000000[4] = {0x80000000,0x80000000, 0x80000000,0x80000000}; - float resf, r; - int i, q, s; - __m128i res128, mask, zero; - _mm_store_si128((__m128i*)atmp, a); - zero = _mm_setzero_si128(); - for (i =0; i<4; i++){ - resf = (atmp[i] * (0.5f / (uint32_t) (1 << 31))); // 2.3283064365386963E-10 ~(0.5f / (uint32_t) (1 << 31)) - q = (int)(resf * 512.0); /* a in units of 1/512 rounded down */ - r = 1.0 / (((float)q + 0.5) / 512.0); /* reciprocal r */ - s = (int)(256.0 * r + 0.5); /* r in units of 1/256 rounded to nearest */ - r = (float)s / 256.0; - res[i] = (uint32_t) (r * (((uint32_t)1) << 31) ); - } - res128 = _mm_load_si128((__m128i*)res); - mask = _mm_and_si128(a, *(__m128i*)c80000000); - mask = _mm_cmpeq_epi32(zero, mask); //0xffffffff if atmp[i] <= 0x7fffffff - return _mm_or_si128(res128, mask); -} - -//**********Reciprocal square root estimate **************** -//********************************************************** -//no reciprocal square root for ints in IA32 available, neither for unsigned int to float4 lanes conversion, so a serial solution looks faster -//but the particular implementation for vrsqrte_u32 may vary for various ARM compilers -////the ARM NEON and x86 SIMD results may be slightly different -float32x2_t vrsqrte_f32(float32x2_t a); // VRSQRTE.F32 d0,d0 -_NEON2SSE_INLINE float32x2_t vrsqrte_f32(float32x2_t a) //use low 64 bits -{ - float32x4_t res; - __m64_128 res64; - res = _mm_rsqrt_ps(_pM128(a)); - _M64f(res64, res); - return res64; -} - -uint32x2_t vrsqrte_u32(uint32x2_t a); // VRSQRTE.U32 d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x2_t vrsqrte_u32(uint32x2_t a), _NEON2SSE_REASON_SLOW_SERIAL) -{ - //Input is fixed point number!!! 
- //We implement the recip_sqrt_estimate function as described in ARMv7 reference manual (VRSQRTE instruction) but use float instead of double - uint32x2_t res; - __m128 tmp; - float r, resf, coeff; - int i, q0, s; - for (i =0; i<2; i++){ - if((a.m64_u32[i] & 0xc0000000) == 0) { //a <=0x3fffffff - res.m64_u32[i] = 0xffffffff; - }else{ - resf = (float) (a.m64_u32[i] * (0.5f / (uint32_t)(1 << 31))); - coeff = (resf < 0.5)? 512.0 : 256.0 ; /* range 0.25 <= resf < 0.5 or range 0.5 <= resf < 1.0*/ - q0 = (int)(resf * coeff); /* a in units of 1/512 rounded down */ - r = ((float)q0 + 0.5) / coeff; - tmp = _mm_rsqrt_ss(_mm_load_ss( &r));/* reciprocal root r */ - _mm_store_ss(&r, tmp); - s = (int)(256.0 * r + 0.5); /* r in units of 1/256 rounded to nearest */ - r = (float)s / 256.0; - res.m64_u32[i] = r * (((uint32_t)1) << 31); - } - } - return res; -} - -float32x4_t vrsqrteq_f32(float32x4_t a); // VRSQRTE.F32 q0,q0 -#define vrsqrteq_f32 _mm_rsqrt_ps - -uint32x4_t vrsqrteq_u32(uint32x4_t a); // VRSQRTE.U32 q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x4_t vrsqrteq_u32(uint32x4_t a), _NEON2SSE_REASON_SLOW_SERIAL) -{ - //Input is fixed point number!!! - //We implement the recip_sqrt_estimate function as described in ARMv7 reference manual (VRSQRTE instruction) but use float instead of double - _NEON2SSE_ALIGN_16 uint32_t atmp[4], res[4]; - _NEON2SSE_ALIGN_16 float c1_31[4] = {(float)(((uint32_t)1) << 31), (float)(((uint32_t)1) << 31),(float)(((uint32_t)1) << 31), (float)(((uint32_t)1) << 31)}; - _NEON2SSE_ALIGN_16 int c_c0000000[4] = {0xc0000000,0xc0000000, 0xc0000000,0xc0000000}; - __m128 tmp; - __m128i res128, mask, zero; - float r, resf, coeff; - int i, q0, s; - _mm_store_si128((__m128i*)atmp, a); - zero = _mm_setzero_si128(); - for (i =0; i<4; i++){ - resf = (float) (atmp[i] * (0.5f / (uint32_t)(1 << 31))); - coeff = (resf < 0.5)? 
512.0 : 256.0 ; /* range 0.25 <= resf < 0.5 or range 0.5 <= resf < 1.0*/ - q0 = (int)(resf * coeff); /* a in units of 1/512 rounded down */ - r = ((float)q0 + 0.5) / coeff; - tmp = _mm_rsqrt_ss(_mm_load_ss( &r));/* reciprocal root r */ - _mm_store_ss(&r, tmp); - s = (int)(256.0 * r + 0.5); /* r in units of 1/256 rounded to nearest */ - r = (float)s / 256.0; - res[i] = (uint32_t) (r * (((uint32_t)1) << 31) ); - } - res128 = _mm_load_si128((__m128i*)res); - mask = _mm_and_si128(a, *(__m128i*)c_c0000000); - mask = _mm_cmpeq_epi32(zero, mask); //0xffffffff if atmp[i] <= 0x3fffffff - return _mm_or_si128(res128, mask); -} -//************ Reciprocal estimate/step and 1/sqrt estimate/step *************************** -//****************************************************************************************** -//******VRECPS (Vector Reciprocal Step) *************************************************** -//multiplies the elements of one vector by the corresponding elements of another vector, -//subtracts each of the results from 2, and places the final results into the elements of the destination vector. 
- -float32x2_t vrecps_f32(float32x2_t a, float32x2_t b); // VRECPS.F32 d0, d0, d0 -_NEON2SSE_INLINE float32x2_t vrecps_f32(float32x2_t a, float32x2_t b) -{ - float32x4_t res; - __m64_128 res64; - res = vrecpsq_f32(_pM128(a), _pM128(b)); - _M64f(res64, res); - return res64; -} - -float32x4_t vrecpsq_f32(float32x4_t a, float32x4_t b); // VRECPS.F32 q0, q0, q0 -_NEON2SSE_INLINE float32x4_t vrecpsq_f32(float32x4_t a, float32x4_t b) // VRECPS.F32 q0, q0, q0 -{ - __m128 f2, mul; - f2 = _mm_set1_ps(2.); - mul = _mm_mul_ps(a,b); - return _mm_sub_ps(f2,mul); -} - -//*****************VRSQRTS (Vector Reciprocal Square Root Step) ***************************** -//multiplies the elements of one vector by the corresponding elements of another vector, -//subtracts each of the results from 3, divides these results by two, and places the final results into the elements of the destination vector. - -float32x2_t vrsqrts_f32(float32x2_t a, float32x2_t b); // VRSQRTS.F32 d0, d0, d0 -_NEON2SSE_INLINE float32x2_t vrsqrts_f32(float32x2_t a, float32x2_t b) -{ - float32x2_t res; - res.m64_f32[0] = (3 - a.m64_f32[0] * b.m64_f32[0]) / 2; - res.m64_f32[1] = (3 - a.m64_f32[1] * b.m64_f32[1]) / 2; - return res; -} - -float32x4_t vrsqrtsq_f32(float32x4_t a, float32x4_t b); // VRSQRTS.F32 q0, q0, q0 -_NEON2SSE_INLINE float32x4_t vrsqrtsq_f32(float32x4_t a, float32x4_t b) // VRSQRTS.F32 q0, q0, q0 -{ - __m128 f3, f05, mul; - f3 = _mm_set1_ps(3.); - f05 = _mm_set1_ps(0.5); - mul = _mm_mul_ps(a,b); - f3 = _mm_sub_ps(f3,mul); - return _mm_mul_ps (f3, f05); -} -//******************************************************************************************** -//***************************** Shifts by signed variable *********************************** -//******************************************************************************************** -//***** Vector shift left: Vr[i] := Va[i] << Vb[i] (negative values shift right) *********************** 
-//******************************************************************************************** -//No such operations in IA32 SIMD unfortunately, constant shift only available, so need to do the serial solution -//helper macro. It matches ARM implementation for big shifts -#define SERIAL_SHIFT(TYPE, INTERNAL_TYPE, LENMAX, LEN) \ - _NEON2SSE_ALIGN_16 TYPE atmp[LENMAX], res[LENMAX]; _NEON2SSE_ALIGN_16 INTERNAL_TYPE btmp[LENMAX]; int i, lanesize = sizeof(INTERNAL_TYPE) << 3; \ - _mm_store_si128((__m128i*)atmp, a); _mm_store_si128((__m128i*)btmp, b); \ - for (i = 0; i<LEN; i++) { \ - if( (btmp[i] >= lanesize)||(btmp[i] <= -lanesize) ) res[i] = 0; \ - else res[i] = (btmp[i] >=0) ? atmp[i] << btmp[i] : atmp[i] >> (-btmp[i]); } \ - return _mm_load_si128((__m128i*)res); - -#define SERIAL_SHIFT_64(TYPE, SIGN, LEN) \ - int ## TYPE ## x ## LEN ## _t res; int i, lanesize = sizeof(int ## TYPE ## _t) << 3; \ - for (i = 0; i<LEN; i++) { \ - if( (b.m64_i ## TYPE[i] >= lanesize)||(b.m64_i ## TYPE[i] <= -lanesize) ) res.m64_ ## SIGN ## TYPE[i] = 0; \ - else res.m64_ ## SIGN ## TYPE[i] = (b.m64_i ## TYPE[i] >=0) ? 
a.m64_ ## SIGN ## TYPE[i] << b.m64_i ## TYPE[i] : a.m64_ ## SIGN ## TYPE[i] >> (-b.m64_i ## TYPE[i]); } \ - return res; - -int8x8_t vshl_s8(int8x8_t a, int8x8_t b); // VSHL.S8 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int8x8_t vshl_s8(int8x8_t a, int8x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SHIFT_64(8, i, 8) -} - -int16x4_t vshl_s16(int16x4_t a, int16x4_t b); // VSHL.S16 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int16x4_t vshl_s16(int16x4_t a, int16x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SHIFT_64(16, i, 4) -} - -int32x2_t vshl_s32(int32x2_t a, int32x2_t b); // VSHL.S32 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x2_t vshl_s32(int32x2_t a, int32x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SHIFT_64(32, i, 2) -} - -int64x1_t vshl_s64(int64x1_t a, int64x1_t b); // VSHL.S64 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x1_t vshl_s64(int64x1_t a, int64x1_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SHIFT_64(64, i, 1) -} - -uint8x8_t vshl_u8(uint8x8_t a, int8x8_t b); // VSHL.U8 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint8x8_t vshl_u8(uint8x8_t a, int8x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SHIFT_64(8, u, 8) -} - -uint16x4_t vshl_u16(uint16x4_t a, int16x4_t b); // VSHL.s16 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint16x4_t vshl_u16(uint16x4_t a, int16x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SHIFT_64(16, u, 4) -} - -uint32x2_t vshl_u32(uint32x2_t a, int32x2_t b); // VSHL.U32 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x2_t vshl_u32(uint32x2_t a, int32x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SHIFT_64(32, u, 2) -} - -uint64x1_t vshl_u64(uint64x1_t a, int64x1_t b); // VSHL.U64 d0,d0,d0 -_NEON2SSE_INLINE uint64x1_t vshl_u64(uint64x1_t a, int64x1_t b) //if we use the SERIAL_SHIFT macro need to have the special processing for large numbers -{ - SERIAL_SHIFT_64(64, u, 1) -} - -int8x16_t vshlq_s8(int8x16_t 
a, int8x16_t b); // VSHL.S8 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int8x16_t vshlq_s8(int8x16_t a, int8x16_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SHIFT(int8_t, int8_t, 16, 16) -} - -int16x8_t vshlq_s16(int16x8_t a, int16x8_t b); // VSHL.S16 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int16x8_t vshlq_s16(int16x8_t a, int16x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SHIFT(int16_t, int16_t, 8, 8) -} - -int32x4_t vshlq_s32(int32x4_t a, int32x4_t b); // VSHL.S32 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x4_t vshlq_s32(int32x4_t a, int32x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SHIFT(int32_t, int32_t, 4, 4) -} - -int64x2_t vshlq_s64(int64x2_t a, int64x2_t b); // VSHL.S64 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x2_t vshlq_s64(int64x2_t a, int64x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SHIFT(int64_t, int64_t, 2, 2) -} - -uint8x16_t vshlq_u8(uint8x16_t a, int8x16_t b); // VSHL.U8 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint8x16_t vshlq_u8(uint8x16_t a, int8x16_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SHIFT(uint8_t, int8_t, 16, 16) -} - -uint16x8_t vshlq_u16(uint16x8_t a, int16x8_t b); // VSHL.s16 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint16x8_t vshlq_u16(uint16x8_t a, int16x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SHIFT(uint16_t, int16_t, 8, 8) -} - -uint32x4_t vshlq_u32(uint32x4_t a, int32x4_t b); // VSHL.U32 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x4_t vshlq_u32(uint32x4_t a, int32x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SHIFT(uint32_t, int32_t, 4, 4) -} - -uint64x2_t vshlq_u64(uint64x2_t a, int64x2_t b); // VSHL.U64 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING( uint64x2_t vshlq_u64(uint64x2_t a, int64x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SHIFT(uint64_t, int64_t, 2, 2) -} - - -//*********** Vector saturating shift left: (negative values shift right) 
********************** -//******************************************************************************************** -//No such operations in IA32 SIMD available yet, constant shift only available, so need to do the serial solution -#define SERIAL_SATURATING_SHIFT_SIGNED(TYPE, LENMAX, LEN) \ - _NEON2SSE_ALIGN_16 TYPE atmp[LENMAX], res[LENMAX], btmp[LENMAX]; TYPE limit; int i; \ - int lanesize_1 = (sizeof(TYPE) << 3) - 1; \ - _mm_store_si128((__m128i*)atmp, a); _mm_store_si128((__m128i*)btmp, b); \ - for (i = 0; i<LEN; i++) { \ - if (atmp[i] ==0) res[i] = 0; \ - else{ \ - if(btmp[i] <0) res[i] = atmp[i] >> (-btmp[i]); \ - else{ \ - if (btmp[i]>lanesize_1) { \ - res[i] = ((_UNSIGNED_T(TYPE))atmp[i] >> lanesize_1 ) + ((TYPE)1 << lanesize_1) - 1; \ - }else{ \ - limit = (TYPE)1 << (lanesize_1 - btmp[i]); \ - if((atmp[i] >= limit)||(atmp[i] <= -limit)) \ - res[i] = ((_UNSIGNED_T(TYPE))atmp[i] >> lanesize_1 ) + ((TYPE)1 << lanesize_1) - 1; \ - else res[i] = atmp[i] << btmp[i]; }}}} \ - return _mm_load_si128((__m128i*)res); - -#define SERIAL_SATURATING_SHIFT_UNSIGNED(TYPE, LENMAX, LEN) \ - _NEON2SSE_ALIGN_16 _UNSIGNED_T(TYPE) atmp[LENMAX], res[LENMAX]; _NEON2SSE_ALIGN_16 TYPE btmp[LENMAX]; _UNSIGNED_T(TYPE) limit; int i; \ - TYPE lanesize = (sizeof(TYPE) << 3); \ - _mm_store_si128((__m128i*)atmp, a); _mm_store_si128((__m128i*)btmp, b); \ - for (i = 0; i<LEN; i++) { \ - if (atmp[i] ==0) {res[i] = 0; \ - }else{ \ - if(btmp[i] < 0) res[i] = atmp[i] >> (-btmp[i]); \ - else{ \ - if (btmp[i]>lanesize) res[i] = ~((TYPE)0); \ - else{ \ - limit = (TYPE) 1 << (lanesize - btmp[i]); \ - res[i] = ( atmp[i] >= limit) ? 
~((TYPE)0) : atmp[i] << btmp[i]; }}}} \ - return _mm_load_si128((__m128i*)res); - -#define SERIAL_SATURATING_SHIFT_SIGNED_64(TYPE, LEN) \ - int ## TYPE ## x ## LEN ## _t res; int ## TYPE ## _t limit; int i; \ - int lanesize_1 = (sizeof( int ## TYPE ## _t) << 3) - 1; \ - for (i = 0; i<LEN; i++) { \ - if (a.m64_i ## TYPE[i] ==0) res.m64_i ## TYPE[i] = 0; \ - else{ \ - if(b.m64_i ## TYPE[i] <0) res.m64_i ## TYPE[i] = a.m64_i ## TYPE[i] >> (-(b.m64_i ## TYPE[i])); \ - else{ \ - if (b.m64_i ## TYPE[i]>lanesize_1) { \ - res.m64_i ## TYPE[i] = ((_UNSIGNED_T(int ## TYPE ## _t))a.m64_i ## TYPE[i] >> lanesize_1 ) + ((int ## TYPE ## _t) 1 << lanesize_1) - 1; \ - }else{ \ - limit = (int ## TYPE ## _t) 1 << (lanesize_1 - b.m64_i ## TYPE[i]); \ - if((a.m64_i ## TYPE[i] >= limit)||(a.m64_i ## TYPE[i] <= -limit)) \ - res.m64_i ## TYPE[i] = ((_UNSIGNED_T(int ## TYPE ## _t))a.m64_i ## TYPE[i] >> lanesize_1 ) + ((int ## TYPE ## _t) 1 << lanesize_1) - 1; \ - else res.m64_i ## TYPE[i] = a.m64_i ## TYPE[i] << b.m64_i ## TYPE[i]; }}}} \ - return res; - -#define SERIAL_SATURATING_SHIFT_UNSIGNED_64(TYPE, LEN) \ - int ## TYPE ## x ## LEN ## _t res; _UNSIGNED_T(int ## TYPE ## _t) limit; int i; \ - int ## TYPE ## _t lanesize = (sizeof(int ## TYPE ## _t) << 3); \ - for (i = 0; i<LEN; i++) { \ - if (a.m64_u ## TYPE[i] ==0) {res.m64_u ## TYPE[i] = 0; \ - }else{ \ - if(b.m64_i ## TYPE[i] < 0) res.m64_u ## TYPE[i] = a.m64_u ## TYPE[i] >> (-(b.m64_i ## TYPE[i])); \ - else{ \ - if (b.m64_i ## TYPE[i]>lanesize) res.m64_u ## TYPE[i] = ~((int ## TYPE ## _t) 0); \ - else{ \ - limit = (int ## TYPE ## _t) 1 << (lanesize - b.m64_i ## TYPE[i]); \ - res.m64_u ## TYPE[i] = ( a.m64_u ## TYPE[i] >= limit) ? 
~((int ## TYPE ## _t) 0) : a.m64_u ## TYPE[i] << b.m64_i ## TYPE[i]; }}}} \ - return res; - -int8x8_t vqshl_s8(int8x8_t a, int8x8_t b); // VQSHL.S8 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int8x8_t vqshl_s8(int8x8_t a, int8x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_SHIFT_SIGNED_64(8,8) -} - -int16x4_t vqshl_s16(int16x4_t a, int16x4_t b); // VQSHL.S16 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int16x4_t vqshl_s16(int16x4_t a, int16x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_SHIFT_SIGNED_64(16,4) -} - -int32x2_t vqshl_s32(int32x2_t a, int32x2_t b); // VQSHL.S32 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x2_t vqshl_s32(int32x2_t a, int32x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_SHIFT_SIGNED_64(32,2) -} - -int64x1_t vqshl_s64(int64x1_t a, int64x1_t b); // VQSHL.S64 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x1_t vqshl_s64(int64x1_t a, int64x1_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_SHIFT_SIGNED_64(64,1) -} - -uint8x8_t vqshl_u8(uint8x8_t a, int8x8_t b); // VQSHL.U8 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint8x8_t vqshl_u8(uint8x8_t a, int8x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_SHIFT_UNSIGNED_64(8,8) -} - -uint16x4_t vqshl_u16(uint16x4_t a, int16x4_t b); // VQSHL.s16 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint16x4_t vqshl_u16(uint16x4_t a, int16x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_SHIFT_UNSIGNED_64(16,4) -} - -uint32x2_t vqshl_u32(uint32x2_t a, int32x2_t b); // VQSHL.U32 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x2_t vqshl_u32(uint32x2_t a, int32x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_SHIFT_UNSIGNED_64(32,2) -} - -uint64x1_t vqshl_u64(uint64x1_t a, int64x1_t b); // VQSHL.U64 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x1_t vqshl_u64(uint64x1_t a, int64x1_t b), 
_NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_SHIFT_UNSIGNED_64(64,1) -} - -int8x16_t vqshlq_s8(int8x16_t a, int8x16_t b); // VQSHL.S8 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int8x16_t vqshlq_s8(int8x16_t a, int8x16_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_SHIFT_SIGNED(int8_t, 16, 16) -} - -int16x8_t vqshlq_s16(int16x8_t a, int16x8_t b); // VQSHL.S16 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int16x8_t vqshlq_s16(int16x8_t a, int16x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_SHIFT_SIGNED(int16_t, 8, 8) -} - -int32x4_t vqshlq_s32(int32x4_t a, int32x4_t b); // VQSHL.S32 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x4_t vqshlq_s32(int32x4_t a, int32x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_SHIFT_SIGNED(int32_t, 4, 4) -} - -int64x2_t vqshlq_s64(int64x2_t a, int64x2_t b); // VQSHL.S64 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x2_t vqshlq_s64(int64x2_t a, int64x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_SHIFT_SIGNED(int64_t, 2, 2) -} - -uint8x16_t vqshlq_u8(uint8x16_t a, int8x16_t b); // VQSHL.U8 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint8x16_t vqshlq_u8(uint8x16_t a, int8x16_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_SHIFT_UNSIGNED(int8_t, 16, 16) -} - -uint16x8_t vqshlq_u16(uint16x8_t a, int16x8_t b); // VQSHL.s16 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint16x8_t vqshlq_u16(uint16x8_t a, int16x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_SHIFT_UNSIGNED(int16_t, 8, 8) -} - -uint32x4_t vqshlq_u32(uint32x4_t a, int32x4_t b); // VQSHL.U32 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x4_t vqshlq_u32(uint32x4_t a, int32x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_SHIFT_UNSIGNED(int32_t, 4, 4) -} - -uint64x2_t vqshlq_u64(uint64x2_t a, int64x2_t b); // VQSHL.U64 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x2_t 
vqshlq_u64(uint64x2_t a, int64x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_SHIFT_UNSIGNED(int64_t, 2, 2) -} - - -//******** Vector rounding shift left: (negative values shift right) ********** -//**************************************************************************** -//No such operations are available in IA32 SIMD; only constant shifts exist, so the serial solution is needed -//rounding makes sense for right shifts only. -#define SERIAL_ROUNDING_SHIFT(TYPE, INTERNAL_TYPE, LENMAX, LEN) \ - _NEON2SSE_ALIGN_16 TYPE atmp[LENMAX], res[LENMAX]; _NEON2SSE_ALIGN_16 INTERNAL_TYPE btmp[LENMAX]; INTERNAL_TYPE i, lanesize = sizeof(INTERNAL_TYPE) << 3; \ - _mm_store_si128((__m128i*)atmp, a); _mm_store_si128((__m128i*)btmp, b); \ - for (i = 0; i<LEN; i++) { \ - if( btmp[i] >= 0) { \ - if(btmp[i] >= lanesize) res[i] = 0; \ - else res[i] = (atmp[i] << btmp[i]); \ - }else{ \ - res[i] = (btmp[i] < -lanesize) ? 0 : \ - (btmp[i] == -lanesize) ? (atmp[i] & ((INTERNAL_TYPE)1 << (-btmp[i] - 1))) >> (-btmp[i] - 1) : \ - (atmp[i] >> (-btmp[i])) + ( (atmp[i] & ((INTERNAL_TYPE)1 << (-btmp[i] - 1))) >> (-btmp[i] - 1) ); }} \ - return _mm_load_si128((__m128i*)res); - - -#define SERIAL_ROUNDING_SHIFT_64(TYPE, SIGN, LEN) \ - int ## TYPE ## x ## LEN ## _t res; int i; int lanesize = sizeof(int ## TYPE ## _t) << 3; \ - for (i = 0; i<LEN; i++) { \ - if( b.m64_i ## TYPE[i] >= 0) { \ - if(b.m64_i ## TYPE[i] >= lanesize) res.m64_ ## SIGN ## TYPE[i] = 0; \ - else res.m64_ ## SIGN ## TYPE[i] = (a.m64_ ## SIGN ## TYPE[i] << b.m64_i ## TYPE[i]); \ - }else{ \ - res.m64_ ## SIGN ## TYPE[i] = (b.m64_i ## TYPE[i] < -lanesize) ? 0 : \ - (b.m64_i ## TYPE[i] == -lanesize) ? 
(a.m64_ ## SIGN ## TYPE[i] & ((int ## TYPE ## _t) 1 << (-(b.m64_i ## TYPE[i]) - 1))) >> (-(b.m64_i ## TYPE[i]) - 1) : \ - (a.m64_ ## SIGN ## TYPE[i] >> (-(b.m64_i ## TYPE[i]))) + ( (a.m64_ ## SIGN ## TYPE[i] & ((int ## TYPE ## _t) 1 << (-(b.m64_i ## TYPE[i]) - 1))) >> (-(b.m64_i ## TYPE[i]) - 1) ); }} \ - return res; - - -int8x8_t vrshl_s8(int8x8_t a, int8x8_t b); // VRSHL.S8 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int8x8_t vrshl_s8(int8x8_t a, int8x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_ROUNDING_SHIFT_64(8,i,8) -} - -int16x4_t vrshl_s16(int16x4_t a, int16x4_t b); // VRSHL.S16 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int16x4_t vrshl_s16(int16x4_t a, int16x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_ROUNDING_SHIFT_64(16,i,4) -} - -int32x2_t vrshl_s32(int32x2_t a, int32x2_t b); // VRSHL.S32 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x2_t vrshl_s32(int32x2_t a, int32x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_ROUNDING_SHIFT_64(32,i,2) -} - -int64x1_t vrshl_s64(int64x1_t a, int64x1_t b); // VRSHL.S64 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x1_t vrshl_s64(int64x1_t a, int64x1_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_ROUNDING_SHIFT_64(64,i,1) -} - -uint8x8_t vrshl_u8(uint8x8_t a, int8x8_t b); // VRSHL.U8 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint8x8_t vrshl_u8(uint8x8_t a, int8x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_ROUNDING_SHIFT_64(8,u,8) -} - -uint16x4_t vrshl_u16(uint16x4_t a, int16x4_t b); // VRSHL.s16 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint16x4_t vrshl_u16(uint16x4_t a, int16x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_ROUNDING_SHIFT_64(16,u,4) -} - -uint32x2_t vrshl_u32(uint32x2_t a, int32x2_t b); // VRSHL.U32 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x2_t vrshl_u32(uint32x2_t a, int32x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_ROUNDING_SHIFT_64(32,u,2) -} - -uint64x1_t 
vrshl_u64(uint64x1_t a, int64x1_t b); // VRSHL.U64 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x1_t vrshl_u64(uint64x1_t a, int64x1_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_ROUNDING_SHIFT_64(64,u,1) -} - -int8x16_t vrshlq_s8(int8x16_t a, int8x16_t b); // VRSHL.S8 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int8x16_t vrshlq_s8(int8x16_t a, int8x16_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_ROUNDING_SHIFT(int8_t, int8_t, 16, 16) -} - -int16x8_t vrshlq_s16(int16x8_t a, int16x8_t b); // VRSHL.S16 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int16x8_t vrshlq_s16(int16x8_t a, int16x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_ROUNDING_SHIFT(int16_t, int16_t, 8, 8) -} - -int32x4_t vrshlq_s32(int32x4_t a, int32x4_t b); // VRSHL.S32 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x4_t vrshlq_s32(int32x4_t a, int32x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_ROUNDING_SHIFT(int32_t, int32_t, 4, 4) -} - -int64x2_t vrshlq_s64(int64x2_t a, int64x2_t b); // VRSHL.S64 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x2_t vrshlq_s64(int64x2_t a, int64x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_ROUNDING_SHIFT(int64_t, int64_t, 2, 2) -} - -uint8x16_t vrshlq_u8(uint8x16_t a, int8x16_t b); // VRSHL.U8 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint8x16_t vrshlq_u8(uint8x16_t a, int8x16_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_ROUNDING_SHIFT(uint8_t, int8_t, 16, 16) -} - -uint16x8_t vrshlq_u16(uint16x8_t a, int16x8_t b); // VRSHL.s16 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint16x8_t vrshlq_u16(uint16x8_t a, int16x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_ROUNDING_SHIFT(uint16_t, int16_t, 8, 8) -} - -uint32x4_t vrshlq_u32(uint32x4_t a, int32x4_t b); // VRSHL.U32 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x4_t vrshlq_u32(uint32x4_t a, int32x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_ROUNDING_SHIFT(uint32_t, int32_t, 
4, 4) -} - -uint64x2_t vrshlq_u64(uint64x2_t a, int64x2_t b); // VRSHL.U64 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x2_t vrshlq_u64(uint64x2_t a, int64x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_ROUNDING_SHIFT(uint64_t, int64_t, 2, 2) -} - - -//********** Vector saturating rounding shift left: (negative values shift right) **************** -//************************************************************************************************* -//No such operations in IA32 SIMD unfortunately, constant shift only available, so need to do the serial solution -//Saturation happens for left shifts only while rounding makes sense for right shifts only. -#define SERIAL_SATURATING_ROUNDING_SHIFT_SIGNED(TYPE, LENMAX, LEN) \ - _NEON2SSE_ALIGN_16 TYPE atmp[LENMAX], res[LENMAX], btmp[LENMAX]; TYPE limit; int i; \ - int lanesize_1 = (sizeof(TYPE) << 3) - 1; \ - _mm_store_si128((__m128i*)atmp, a); _mm_store_si128((__m128i*)btmp, b); \ - for (i = 0; i<LEN; i++) { \ - if (atmp[i] ==0) res[i] = 0; \ - else{ \ - if(btmp[i] <0) res[i] = (btmp[i] < (-lanesize_1)) ? 
0 : (atmp[i] >> (-btmp[i])) + ( (atmp[i] & ((TYPE)1 << (-btmp[i] - 1))) >> (-btmp[i] - 1) ); \ - else{ \ - if (btmp[i]>lanesize_1) { \ - res[i] = ((_UNSIGNED_T(TYPE))atmp[i] >> lanesize_1 ) + ((TYPE)1 << lanesize_1) - 1; \ - }else{ \ - limit = (TYPE)1 << (lanesize_1 - btmp[i]); \ - if((atmp[i] >= limit)||(atmp[i] <= -limit)) \ - res[i] = ((_UNSIGNED_T(TYPE))atmp[i] >> lanesize_1 ) + ((TYPE)1 << lanesize_1) - 1; \ - else res[i] = atmp[i] << btmp[i]; }}}} \ - return _mm_load_si128((__m128i*)res); - -#define SERIAL_SATURATING_ROUNDING_SHIFT_UNSIGNED(TYPE, LENMAX, LEN) \ - _NEON2SSE_ALIGN_16 _UNSIGNED_T(TYPE) atmp[LENMAX], res[LENMAX]; _NEON2SSE_ALIGN_16 TYPE btmp[LENMAX]; _UNSIGNED_T(TYPE) limit; int i; \ - int lanesize = (sizeof(TYPE) << 3); \ - _mm_store_si128((__m128i*)atmp, a); _mm_store_si128((__m128i*)btmp, b); \ - for (i = 0; i<LEN; i++) { \ - if (atmp[i] ==0) {res[i] = 0; \ - }else{ \ - if(btmp[i] < 0) res[i] = (btmp[i] < (-lanesize)) ? 0 : (atmp[i] >> (-btmp[i])) + ( (atmp[i] & ((TYPE)1 << (-btmp[i] - 1))) >> (-btmp[i] - 1) ); \ - else{ \ - if (btmp[i]>lanesize) res[i] = ~((TYPE)0); \ - else{ \ - limit = (TYPE) 1 << (lanesize - btmp[i]); \ - res[i] = ( atmp[i] >= limit) ? ~((TYPE)0) : atmp[i] << btmp[i]; }}}} \ - return _mm_load_si128((__m128i*)res); - -#define SERIAL_SATURATING_ROUNDING_SHIFT_SIGNED_64(TYPE, LEN) \ - __m64_128 res; int ## TYPE ## _t limit; int i; \ - int lanesize_1 = (sizeof(int ## TYPE ## _t ) << 3) - 1; \ - for (i = 0; i<LEN; i++) { \ - if (a.m64_i ## TYPE[i] ==0) res.m64_i ## TYPE[i] = 0; \ - else{ \ - if(b.m64_i ## TYPE[i] <0) res.m64_i ## TYPE[i] = (b.m64_i ## TYPE[i] < (-lanesize_1)) ? 
0 : (a.m64_i ## TYPE[i] >> (-(b.m64_i ## TYPE[i]))) + ( (a.m64_i ## TYPE[i] & ((int ## TYPE ## _t ) 1 << (-(b.m64_i ## TYPE[i]) - 1))) >> (-(b.m64_i ## TYPE[i]) - 1) ); \ - else{ \ - if (b.m64_i ## TYPE[i]>lanesize_1) { \ - res.m64_i ## TYPE[i] = ((_UNSIGNED_T(int ## TYPE ## _t ))a.m64_i ## TYPE[i] >> lanesize_1 ) + ((int ## TYPE ## _t ) 1 << lanesize_1) - 1; \ - }else{ \ - limit = (int ## TYPE ## _t ) 1 << (lanesize_1 - b.m64_i ## TYPE[i]); \ - if((a.m64_i ## TYPE[i] >= limit)||(a.m64_i ## TYPE[i] <= -limit)) \ - res.m64_i ## TYPE[i] = ((_UNSIGNED_T(int ## TYPE ## _t ))a.m64_i ## TYPE[i] >> lanesize_1 ) + ((int ## TYPE ## _t ) 1 << lanesize_1) - 1; \ - else res.m64_i ## TYPE[i] = a.m64_i ## TYPE[i] << b.m64_i ## TYPE[i]; }}}} \ - return res; - -#define SERIAL_SATURATING_ROUNDING_SHIFT_UNSIGNED_64(TYPE, LEN) \ - __m64_128 res; _UNSIGNED_T(int ## TYPE ## _t) limit; int i; \ - int lanesize = (sizeof(int ## TYPE ## _t) << 3); \ - for (i = 0; i<LEN; i++) { \ - if (a.m64_u ## TYPE[i] ==0) {res.m64_u ## TYPE[i] = 0; \ - }else{ \ - if(b.m64_i ## TYPE[i] < 0) res.m64_u ## TYPE[i] = (b.m64_i ## TYPE[i] < (-lanesize)) ? 0 : (a.m64_u ## TYPE[i] >> (-(b.m64_i ## TYPE[i]))) + ( (a.m64_u ## TYPE[i] & ((int ## TYPE ## _t) 1 << (-(b.m64_i ## TYPE[i]) - 1))) >> (-(b.m64_i ## TYPE[i]) - 1) ); \ - else{ \ - if (b.m64_i ## TYPE[i]>lanesize) res.m64_u ## TYPE[i] = ~((int ## TYPE ## _t) 0); \ - else{ \ - limit = (int ## TYPE ## _t) 1 << (lanesize - b.m64_i ## TYPE[i]); \ - res.m64_u ## TYPE[i] = ( a.m64_u ## TYPE[i] >= limit) ? 
~((int ## TYPE ## _t) 0) : a.m64_u ## TYPE[i] << b.m64_i ## TYPE[i]; }}}} \ - return res; - -int8x8_t vqrshl_s8(int8x8_t a, int8x8_t b); // VQRSHL.S8 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int8x8_t vqrshl_s8(int8x8_t a, int8x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_ROUNDING_SHIFT_SIGNED_64(8,8) -} - -int16x4_t vqrshl_s16(int16x4_t a, int16x4_t b); // VQRSHL.S16 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int16x4_t vqrshl_s16(int16x4_t a, int16x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_ROUNDING_SHIFT_SIGNED_64(16,4) -} - -int32x2_t vqrshl_s32(int32x2_t a, int32x2_t b); // VQRSHL.S32 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x2_t vqrshl_s32(int32x2_t a, int32x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_ROUNDING_SHIFT_SIGNED_64(32,2) -} - -int64x1_t vqrshl_s64(int64x1_t a, int64x1_t b); // VQRSHL.S64 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x1_t vqrshl_s64(int64x1_t a, int64x1_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_ROUNDING_SHIFT_SIGNED_64(64,1) -} - -uint8x8_t vqrshl_u8(uint8x8_t a, int8x8_t b); // VQRSHL.U8 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint8x8_t vqrshl_u8(uint8x8_t a, int8x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_ROUNDING_SHIFT_UNSIGNED_64(8,8) -} - -uint16x4_t vqrshl_u16(uint16x4_t a, int16x4_t b); // VQRSHL.U16 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint16x4_t vqrshl_u16(uint16x4_t a, int16x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_ROUNDING_SHIFT_UNSIGNED_64(16,4) -} - -uint32x2_t vqrshl_u32(uint32x2_t a, int32x2_t b); // VQRSHL.U32 d0,d0,d0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x2_t vqrshl_u32(uint32x2_t a, int32x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_ROUNDING_SHIFT_UNSIGNED_64(32,2) -} - -uint64x1_t vqrshl_u64(uint64x1_t a, int64x1_t b); // VQRSHL.U64 d0,d0,d0 
-_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x1_t vqrshl_u64(uint64x1_t a, int64x1_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_ROUNDING_SHIFT_UNSIGNED_64(64,1) -} - -int8x16_t vqrshlq_s8(int8x16_t a, int8x16_t b); // VQRSHL.S8 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int8x16_t vqrshlq_s8(int8x16_t a, int8x16_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_ROUNDING_SHIFT_SIGNED(int8_t, 16, 16) -} - -int16x8_t vqrshlq_s16(int16x8_t a, int16x8_t b); // VQRSHL.S16 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int16x8_t vqrshlq_s16(int16x8_t a, int16x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_ROUNDING_SHIFT_SIGNED(int16_t, 8, 8) -} - -int32x4_t vqrshlq_s32(int32x4_t a, int32x4_t b); // VQRSHL.S32 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x4_t vqrshlq_s32(int32x4_t a, int32x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_ROUNDING_SHIFT_SIGNED(int32_t, 4, 4) -} - -int64x2_t vqrshlq_s64(int64x2_t a, int64x2_t b); // VQRSHL.S64 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x2_t vqrshlq_s64(int64x2_t a, int64x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_ROUNDING_SHIFT_SIGNED(int64_t, 2, 2) -} - -uint8x16_t vqrshlq_u8(uint8x16_t a, int8x16_t b); // VQRSHL.U8 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint8x16_t vqrshlq_u8(uint8x16_t a, int8x16_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_ROUNDING_SHIFT_UNSIGNED(int8_t, 16, 16) -} - -uint16x8_t vqrshlq_u16(uint16x8_t a, int16x8_t b); // VQRSHL.s16 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint16x8_t vqrshlq_u16(uint16x8_t a, int16x8_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_ROUNDING_SHIFT_UNSIGNED(int16_t, 8, 8) -} - -uint32x4_t vqrshlq_u32(uint32x4_t a, int32x4_t b); // VQRSHL.U32 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x4_t vqrshlq_u32(uint32x4_t a, int32x4_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - 
SERIAL_SATURATING_ROUNDING_SHIFT_UNSIGNED(int32_t, 4, 4) -} - -uint64x2_t vqrshlq_u64(uint64x2_t a, int64x2_t b); // VQRSHL.U64 q0,q0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x2_t vqrshlq_u64(uint64x2_t a, int64x2_t b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - SERIAL_SATURATING_ROUNDING_SHIFT_UNSIGNED(int64_t, 2, 2) -} - -// ********************************************************************************* -// ***************************** Shifts by a constant ***************************** -// ********************************************************************************* -//**************** Vector shift right by constant************************************* -//************************************************************************************ -int8x8_t vshr_n_s8(int8x8_t a, __constrange(1,8) int b); // VSHR.S8 d0,d0,#8 -_NEON2SSE_INLINE int8x8_t vshr_n_s8(int8x8_t a, __constrange(1,8) int b) // VSHR.S8 d0,d0,#8 -{ - //no 8 bit shift available, go to 16 bit - int8x8_t res64; - __m128i r; - r = _MM_CVTEPI8_EPI16 (_pM128i(a)); //SSE 4.1 - r = _mm_srai_epi16 (r, b); //SSE2 - r = _mm_packs_epi16 (r,r); //we need 64 bits only - return64(r); -} - -int16x4_t vshr_n_s16(int16x4_t a, __constrange(1,16) int b); // VSHR.S16 d0,d0,#16 -_NEON2SSE_INLINE int16x4_t vshr_n_s16(int16x4_t a, __constrange(1,16) int b) -{ - int16x4_t res64; - return64(_mm_srai_epi16(_pM128i(a), b)); -} - - -int32x2_t vshr_n_s32(int32x2_t a, __constrange(1,32) int b); // VSHR.S32 d0,d0,#32 -_NEON2SSE_INLINE int32x2_t vshr_n_s32(int32x2_t a, __constrange(1,32) int b) -{ - int32x2_t res64; - return64(_mm_srai_epi32(_pM128i(a), b)); -} - -int64x1_t vshr_n_s64(int64x1_t a, __constrange(1,64) int b); // VSHR.S64 d0,d0,#64 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x1_t vshr_n_s64(int64x1_t a, __constrange(1,64) int b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - //no arithmetic shift for 64bit values, serial solution used - int64x1_t res; - if(b>=64) res.m64_i64[0] = 0; - else res.m64_i64[0] = 
(*(int64_t*)&a) >> b; - return res; -} - -uint8x8_t vshr_n_u8(uint8x8_t a, __constrange(1,8) int b); // VSHR.U8 d0,d0,#8 -_NEON2SSE_INLINE uint8x8_t vshr_n_u8(uint8x8_t a, __constrange(1,8) int b) // VSHR.U8 d0,d0,#8 -{ - //no 8 bit shift available, go to 16 bit - uint8x8_t res64; - __m128i r; - r = _MM_CVTEPU8_EPI16 (_pM128i(a)); //SSE 4.1 - r = _mm_srli_epi16 (r, b); //for unsigned variables we use the logical shift not arithmetical one - r = _mm_packus_epi16 (r,r); //we need 64 bits only - return64(r); -} - -uint16x4_t vshr_n_u16(uint16x4_t a, __constrange(1,16) int b); // VSHR.s16 d0,d0,#16 -_NEON2SSE_INLINE uint16x4_t vshr_n_u16(uint16x4_t a, __constrange(1,16) int b) -{ - uint16x4_t res64; - return64(_mm_srli_epi16(_pM128i(a), b)); -} - - -uint32x2_t vshr_n_u32(uint32x2_t a, __constrange(1,32) int b); // VSHR.U32 d0,d0,#32 -_NEON2SSE_INLINE uint32x2_t vshr_n_u32(uint32x2_t a, __constrange(1,32) int b) -{ - uint32x2_t res64; - return64(_mm_srli_epi32(_pM128i(a), b)); -} - - -uint64x1_t vshr_n_u64(uint64x1_t a, __constrange(1,64) int b); // VSHR.U64 d0,d0,#64 -_NEON2SSE_INLINE uint64x1_t vshr_n_u64(uint64x1_t a, __constrange(1,64) int b) -{ - uint64x1_t res64; - return64(_mm_srli_epi64(_pM128i(a), b)); -} - - -int8x16_t vshrq_n_s8(int8x16_t a, __constrange(1,8) int b); // VSHR.S8 q0,q0,#8 -_NEON2SSE_INLINE int8x16_t vshrq_n_s8(int8x16_t a, __constrange(1,8) int b) // VSHR.S8 q0,q0,#8 -{ - //no 8 bit shift available, go to 16 bit trick - __m128i zero, mask0, a_sign, r, a_sign_mask; - _NEON2SSE_ALIGN_16 int16_t mask0_16[9] = {0x0000, 0x0080, 0x00c0, 0x00e0, 0x00f0, 0x00f8, 0x00fc, 0x00fe, 0x00ff}; - zero = _mm_setzero_si128(); - mask0 = _mm_set1_epi16(mask0_16[b]); //to mask the bits to be "spoiled" by 16 bit shift - a_sign = _mm_cmpgt_epi8 (zero, a); //ff if a<0 or zero if a>0 - r = _mm_srai_epi16 (a, b); - a_sign_mask = _mm_and_si128 (mask0, a_sign); - r = _mm_andnot_si128 (mask0, r); - return _mm_or_si128 (r, a_sign_mask); -} - -int16x8_t vshrq_n_s16(int16x8_t 
a, __constrange(1,16) int b); // VSHR.S16 q0,q0,#16 -#define vshrq_n_s16 _mm_srai_epi16 - -int32x4_t vshrq_n_s32(int32x4_t a, __constrange(1,32) int b); // VSHR.S32 q0,q0,#32 -#define vshrq_n_s32 _mm_srai_epi32 - -int64x2_t vshrq_n_s64(int64x2_t a, __constrange(1,64) int b); // VSHR.S64 q0,q0,#64 -_NEON2SSE_INLINE int64x2_t vshrq_n_s64(int64x2_t a, __constrange(1,64) int b) -{ - //SIMD implementation may not be optimal due to the absence of a 64 bit arithmetic shift in x86 SIMD - __m128i c1, signmask,a0, res64; - _NEON2SSE_ALIGN_16 uint64_t mask[] = {0x8000000000000000, 0x8000000000000000}; - c1 = _mm_cmpeq_epi32(a,a); //0xffffffffffffffff - signmask = _mm_slli_epi64 (c1, (64 - b)); - a0 = _mm_or_si128(a, *(__m128i*)mask); //set the sign bit - a0 = _MM_CMPEQ_EPI64 (a, a0); - signmask = _mm_and_si128(a0, signmask); - res64 = _mm_srli_epi64 (a, b); - return _mm_or_si128(res64, signmask); -} - -uint8x16_t vshrq_n_u8(uint8x16_t a, __constrange(1,8) int b); // VSHR.U8 q0,q0,#8 -_NEON2SSE_INLINE uint8x16_t vshrq_n_u8(uint8x16_t a, __constrange(1,8) int b) // VSHR.U8 q0,q0,#8 -{ - //no 8 bit shift available, need the special trick - __m128i mask0, r; - _NEON2SSE_ALIGN_16 uint16_t mask10_16[9] = {0xffff, 0xff7f, 0xff3f, 0xff1f, 0xff0f, 0xff07, 0xff03, 0xff01, 0xff00}; - mask0 = _mm_set1_epi16(mask10_16[b]); //to mask the bits to be "spoiled" by 16 bit shift - r = _mm_srli_epi16 ( a, b); - return _mm_and_si128 (r, mask0); -} - -uint16x8_t vshrq_n_u16(uint16x8_t a, __constrange(1,16) int b); // VSHR.U16 q0,q0,#16 -#define vshrq_n_u16 _mm_srli_epi16 - -uint32x4_t vshrq_n_u32(uint32x4_t a, __constrange(1,32) int b); // VSHR.U32 q0,q0,#32 -#define vshrq_n_u32 _mm_srli_epi32 - -uint64x2_t vshrq_n_u64(uint64x2_t a, __constrange(1,64) int b); // VSHR.U64 q0,q0,#64 -#define vshrq_n_u64 _mm_srli_epi64 - -//*************************** Vector shift left by constant ************************* -//********************************************************************************* -int8x8_t 
vshl_n_s8(int8x8_t a, __constrange(0,7) int b); // VSHL.I8 d0,d0,#0 -_NEON2SSE_INLINE int8x8_t vshl_n_s8(int8x8_t a, __constrange(0,7) int b) // VSHL.I8 d0,d0,#0 -{ - //no 8 bit shift available, go to 16 bit - int8x8_t res64; - __m128i r; - _NEON2SSE_ALIGN_16 int8_t mask8_16_even_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15 }; - r = _MM_CVTEPI8_EPI16 (_pM128i(a)); //SSE 4.1 - r = _mm_slli_epi16 (r, b); //SSE2 - r = _mm_shuffle_epi8 (r, *(__m128i*) mask8_16_even_odd); //return to 8 bit, we need 64 bits only - return64(r); -} - -int16x4_t vshl_n_s16(int16x4_t a, __constrange(0,15) int b); // VSHL.I16 d0,d0,#0 -_NEON2SSE_INLINE int16x4_t vshl_n_s16(int16x4_t a, __constrange(0,15) int b) -{ - int16x4_t res64; - return64(_mm_slli_epi16(_pM128i(a), b)); -} - - -int32x2_t vshl_n_s32(int32x2_t a, __constrange(0,31) int b); // VSHL.I32 d0,d0,#0 -_NEON2SSE_INLINE int32x2_t vshl_n_s32(int32x2_t a, __constrange(0,31) int b) -{ - int32x2_t res64; - return64(_mm_slli_epi32(_pM128i(a), b)); -} - - -int64x1_t vshl_n_s64(int64x1_t a, __constrange(0,63) int b); // VSHL.I64 d0,d0,#0 -_NEON2SSE_INLINE int64x1_t vshl_n_s64(int64x1_t a, __constrange(0,63) int b) -{ - int64x1_t res64; - return64(_mm_slli_epi64(_pM128i(a), b)); -} - - -uint8x8_t vshl_n_u8(uint8x8_t a, __constrange(0,7) int b); // VSHL.I8 d0,d0,#0 -_NEON2SSE_INLINE uint8x8_t vshl_n_u8(uint8x8_t a, __constrange(0,7) int b) -{ - //no 8 bit shift available, go to 16 bit - uint8x8_t res64; - __m128i mask8; - __m128i r; - mask8 = _mm_set1_epi16(0xff); - r = _MM_CVTEPU8_EPI16 (_pM128i(a)); //SSE 4.1 - r = _mm_slli_epi16 (r, b); //SSE2 - r = _mm_and_si128(r, mask8); //to avoid saturation - r = _mm_packus_epi16 (r,r); //we need 64 bits only - return64(r); -} - -uint16x4_t vshl_n_u16(uint16x4_t a, __constrange(0,15) int b); // VSHL.I16 d0,d0,#0 -#define vshl_n_u16 vshl_n_s16 - - -uint32x2_t vshl_n_u32(uint32x2_t a, __constrange(0,31) int b); // VSHL.I32 d0,d0,#0 -#define vshl_n_u32 vshl_n_s32 - -uint64x1_t 
vshl_n_u64(uint64x1_t a, __constrange(0,63) int b); // VSHL.I64 d0,d0,#0 -#define vshl_n_u64 vshl_n_s64 - -int8x16_t vshlq_n_s8(int8x16_t a, __constrange(0,7) int b); // VSHL.I8 q0,q0,#0 -#define vshlq_n_s8 vshlq_n_u8 - -int16x8_t vshlq_n_s16(int16x8_t a, __constrange(0,15) int b); // VSHL.I16 q0,q0,#0 -#define vshlq_n_s16 _mm_slli_epi16 - -int32x4_t vshlq_n_s32(int32x4_t a, __constrange(0,31) int b); // VSHL.I32 q0,q0,#0 -#define vshlq_n_s32 _mm_slli_epi32 - -int64x2_t vshlq_n_s64(int64x2_t a, __constrange(0,63) int b); // VSHL.I64 q0,q0,#0 -#define vshlq_n_s64 _mm_slli_epi64 - -uint8x16_t vshlq_n_u8(uint8x16_t a, __constrange(0,7) int b); // VSHL.I8 q0,q0,#0 -_NEON2SSE_INLINE uint8x16_t vshlq_n_u8(uint8x16_t a, __constrange(0,7) int b) -{ - //no 8 bit shift available, need the special trick - __m128i mask0, r; - _NEON2SSE_ALIGN_16 uint16_t mask10_16[9] = {0xffff, 0xfeff, 0xfcff, 0xf8ff, 0xf0ff, 0xe0ff, 0xc0ff, 0x80ff, 0xff}; - mask0 = _mm_set1_epi16(mask10_16[b]); //to mask the bits to be "spoiled" by 16 bit shift - r = _mm_slli_epi16 ( a, b); - return _mm_and_si128 (r, mask0); -} - -uint16x8_t vshlq_n_u16(uint16x8_t a, __constrange(0,15) int b); // VSHL.I16 q0,q0,#0 -#define vshlq_n_u16 vshlq_n_s16 - -uint32x4_t vshlq_n_u32(uint32x4_t a, __constrange(0,31) int b); // VSHL.I32 q0,q0,#0 -#define vshlq_n_u32 vshlq_n_s32 - -uint64x2_t vshlq_n_u64(uint64x2_t a, __constrange(0,63) int b); // VSHL.I64 q0,q0,#0 -#define vshlq_n_u64 vshlq_n_s64 - -//************* Vector rounding shift right by constant ****************** -//************************************************************************* -//No corresponding x86 intrinsics exist, need to do some tricks -int8x8_t vrshr_n_s8(int8x8_t a, __constrange(1,8) int b); // VRSHR.S8 d0,d0,#8 -_NEON2SSE_INLINE int8x8_t vrshr_n_s8(int8x8_t a, __constrange(1,8) int b) // VRSHR.S8 d0,d0,#8 -{ - //no 8 bit shift available, go to 16 bit - int8x8_t res64; - __m128i r, maskb; - r = _MM_CVTEPI8_EPI16 (_pM128i(a)); //SSE 4.1 - maskb 
= _mm_slli_epi16 (r, (16 - b)); //to get rounding (b-1)th bit - maskb = _mm_srli_epi16 (maskb, 15); //1 or 0 - r = _mm_srai_epi16 (r, b); - r = _mm_add_epi16 (r, maskb); //actual rounding - r = _mm_packs_epi16 (r,r); ////we need 64 bits only - return64(r); -} - -int16x4_t vrshr_n_s16(int16x4_t a, __constrange(1,16) int b); // VRSHR.S16 d0,d0,#16 -_NEON2SSE_INLINE int16x4_t vrshr_n_s16(int16x4_t a, __constrange(1,16) int b) -{ - int16x4_t res64; - return64(vrshrq_n_s16(_pM128i(a), b)); -} - - -int32x2_t vrshr_n_s32(int32x2_t a, __constrange(1,32) int b); // VRSHR.S32 d0,d0,#32 -_NEON2SSE_INLINE int32x2_t vrshr_n_s32(int32x2_t a, __constrange(1,32) int b) -{ - int32x2_t res64; - return64(vrshrq_n_s32(_pM128i(a), b)); -} - - -int64x1_t vrshr_n_s64(int64x1_t a, __constrange(1,64) int b); // VRSHR.S64 d0,d0,#64 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x1_t vrshr_n_s64(int64x1_t a, __constrange(1,64) int b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - //serial solution is faster - int64x1_t res; - int64_t a_i64 = *( int64_t*)&a; - if(b==64) { - res.m64_i64[0] = 0; //for some compilers rounding happens and we need to use(a_i64 & _SIGNBIT64)>>63; - } else { - int64_t maskb = a_i64 & (( int64_t)1 << (b - 1)); - res.m64_i64[0] = (a_i64 >> b) + (maskb >> (b - 1)); - } - return res; -} - -uint8x8_t vrshr_n_u8(uint8x8_t a, __constrange(1,8) int b); // VRSHR.U8 d0,d0,#8 -_NEON2SSE_INLINE uint8x8_t vrshr_n_u8(uint8x8_t a, __constrange(1,8) int b) // VRSHR.U8 d0,d0,#8 -{ - //no 8 bit shift available, go to 16 bit, solution may be not optimal compared with the serial one - uint8x8_t res64; - __m128i r, maskb; - r = _MM_CVTEPU8_EPI16 (_pM128i(a)); //SSE 4.1 - maskb = _mm_slli_epi16 (r, (16 - b)); //to get rounding (b-1)th bit - maskb = _mm_srli_epi16 (maskb, 15); //1 or 0 - r = _mm_srli_epi16 (r, b); - r = _mm_add_epi16 (r, maskb); //actual rounding - r = _mm_packus_epi16 (r,r); ////we need 64 bits only - return64(r); -} - -uint16x4_t vrshr_n_u16(uint16x4_t a, __constrange(1,16) 
int b); // VRSHR.s16 d0,d0,#16 -_NEON2SSE_INLINE uint16x4_t vrshr_n_u16(uint16x4_t a, __constrange(1,16) int b) -{ - uint16x4_t res64; - return64(vrshrq_n_u16(_pM128i(a), b)); -} - - -uint32x2_t vrshr_n_u32(uint32x2_t a, __constrange(1,32) int b); // VRSHR.U32 d0,d0,#32 -_NEON2SSE_INLINE uint32x2_t vrshr_n_u32(uint32x2_t a, __constrange(1,32) int b) -{ - uint32x2_t res64; - return64(vrshrq_n_u32(_pM128i(a), b)); -} - - -uint64x1_t vrshr_n_u64(uint64x1_t a, __constrange(1,64) int b); // VRSHR.U64 d0,d0,#64 -_NEON2SSE_INLINE uint64x1_t vrshr_n_u64(uint64x1_t a, __constrange(1,64) int b) -{ - uint64x1_t res64; - return64(vrshrq_n_u64(_pM128i(a), b)); -} - -int8x16_t vrshrq_n_s8(int8x16_t a, __constrange(1,8) int b); // VRSHR.S8 q0,q0,#8 -_NEON2SSE_INLINE int8x16_t vrshrq_n_s8(int8x16_t a, __constrange(1,8) int b) // VRSHR.S8 q0,q0,#8 -{ - //no 8 bit shift available, go to 16 bit trick - __m128i r, mask1, maskb; - _NEON2SSE_ALIGN_16 uint16_t mask2b[9] = {0x0000, 0x0101, 0x0202, 0x0404, 0x0808, 0x1010, 0x2020, 0x4040, 0x8080}; // 2^b-th bit set to 1 - r = vshrq_n_s8 (a, b); - mask1 = _mm_set1_epi16(mask2b[b]); // 2^b-th bit set to 1 for 16bit, need it for rounding - maskb = _mm_and_si128(a, mask1); //get b or 0 for rounding - maskb = _mm_srli_epi16 (maskb, b - 1); // to add 1 - return _mm_add_epi8(r, maskb); //actual rounding -} - -int16x8_t vrshrq_n_s16(int16x8_t a, __constrange(1,16) int b); // VRSHR.S16 q0,q0,#16 -_NEON2SSE_INLINE int16x8_t vrshrq_n_s16(int16x8_t a, __constrange(1,16) int b) // VRSHR.S16 q0,q0,#16 -{ - __m128i maskb, r; - maskb = _mm_slli_epi16(a, (16 - b)); //to get rounding (b-1)th bit - maskb = _mm_srli_epi16(maskb, 15); //1 or 0 - r = _mm_srai_epi16 (a, b); - return _mm_add_epi16 (r, maskb); //actual rounding -} - -int32x4_t vrshrq_n_s32(int32x4_t a, __constrange(1,32) int b); // VRSHR.S32 q0,q0,#32 -_NEON2SSE_INLINE int32x4_t vrshrq_n_s32(int32x4_t a, __constrange(1,32) int b) // VRSHR.S32 q0,q0,#32 -{ - __m128i maskb, r; - maskb = 
_mm_slli_epi32 (a, (32 - b)); //to get rounding (b-1)th bit - maskb = _mm_srli_epi32 (maskb,31); //1 or 0 - r = _mm_srai_epi32(a, b); - return _mm_add_epi32 (r, maskb); //actual rounding -} - -int64x2_t vrshrq_n_s64(int64x2_t a, __constrange(1,64) int b); // VRSHR.S64 q0,q0,#64 -_NEON2SSE_INLINE int64x2_t vrshrq_n_s64(int64x2_t a, __constrange(1,64) int b) -{ - //solution may not be optimal compared with a serial one - __m128i maskb; - int64x2_t r; - maskb = _mm_slli_epi64 (a, (64 - b)); //to get rounding (b-1)th bit - maskb = _mm_srli_epi64 (maskb,63); //1 or 0 - r = vshrq_n_s64(a, b); - return _mm_add_epi64 (r, maskb); //actual rounding -} - -uint8x16_t vrshrq_n_u8(uint8x16_t a, __constrange(1,8) int b); // VRSHR.U8 q0,q0,#8 -_NEON2SSE_INLINE uint8x16_t vrshrq_n_u8(uint8x16_t a, __constrange(1,8) int b) // VRSHR.U8 q0,q0,#8 -{ - //no 8 bit shift available, go to 16 bit trick - __m128i r, mask1, maskb; - _NEON2SSE_ALIGN_16 uint16_t mask2b[9] = {0x0000, 0x0101, 0x0202, 0x0404, 0x0808, 0x1010, 0x2020, 0x4040, 0x8080}; // 2^b-th bit set to 1 - r = vshrq_n_u8 (a, b); - mask1 = _mm_set1_epi16(mask2b[b]); // 2^b-th bit set to 1 for 16bit, need it for rounding - maskb = _mm_and_si128(a, mask1); //get b or 0 for rounding - maskb = _mm_srli_epi16 (maskb, b - 1); // to add 1 - return _mm_add_epi8(r, maskb); //actual rounding -} - -uint16x8_t vrshrq_n_u16(uint16x8_t a, __constrange(1,16) int b); // VRSHR.U16 q0,q0,#16 -_NEON2SSE_INLINE uint16x8_t vrshrq_n_u16(uint16x8_t a, __constrange(1,16) int b) // VRSHR.U16 q0,q0,#16 -{ - __m128i maskb, r; - maskb = _mm_slli_epi16(a, (16 - b)); //to get rounding (b-1)th bit - maskb = _mm_srli_epi16(maskb, 15); //1 or 0 - r = _mm_srli_epi16 (a, b); - return _mm_add_epi16 (r, maskb); //actual rounding -} - -uint32x4_t vrshrq_n_u32(uint32x4_t a, __constrange(1,32) int b); // VRSHR.U32 q0,q0,#32 -_NEON2SSE_INLINE uint32x4_t vrshrq_n_u32(uint32x4_t a, __constrange(1,32) int b) // VRSHR.U32 q0,q0,#32 -{ - __m128i maskb, r; - maskb = 
_mm_slli_epi32 (a, (32 - b)); //to get rounding (b-1)th bit - maskb = _mm_srli_epi32 (maskb,31); //1 or 0 - r = _mm_srli_epi32(a, b); - return _mm_add_epi32 (r, maskb); //actual rounding -} - -uint64x2_t vrshrq_n_u64(uint64x2_t a, __constrange(1,64) int b); // VRSHR.U64 q0,q0,#64 -_NEON2SSE_INLINE uint64x2_t vrshrq_n_u64(uint64x2_t a, __constrange(1,64) int b) -{ - //solution may be not optimal compared with a serial one - __m128i maskb, r; - maskb = _mm_slli_epi64 (a, (64 - b)); //to get rounding (b-1)th bit - maskb = _mm_srli_epi64 (maskb,63); //1 or 0 - r = _mm_srli_epi64(a, b); - return _mm_add_epi64 (r, maskb); //actual rounding -} - -//************* Vector shift right by constant and accumulate ********* -//********************************************************************* -int8x8_t vsra_n_s8(int8x8_t a, int8x8_t b, __constrange(1,8) int c); // VSRA.S8 d0,d0,#8 -_NEON2SSE_INLINE int8x8_t vsra_n_s8(int8x8_t a, int8x8_t b, __constrange(1,8) int c) // VSRA.S8 d0,d0,#8 -{ - int8x8_t shift; - shift = vshr_n_s8(b, c); - return vadd_s8( a, shift); -} - -int16x4_t vsra_n_s16(int16x4_t a, int16x4_t b, __constrange(1,16) int c); // VSRA.S16 d0,d0,#16 -_NEON2SSE_INLINE int16x4_t vsra_n_s16(int16x4_t a, int16x4_t b, __constrange(1,16) int c) // VSRA.S16 d0,d0,#16 -{ - int16x4_t shift; - shift = vshr_n_s16( b, c); - return vadd_s16(a, shift); -} - -int32x2_t vsra_n_s32(int32x2_t a, int32x2_t b, __constrange(1,32) int c); // VSRA.S32 d0,d0,#32 -_NEON2SSE_INLINE int32x2_t vsra_n_s32(int32x2_t a, int32x2_t b, __constrange(1,32) int c) // VSRA.S32 d0,d0,#32 -{ - //may be not optimal compared with the serial execution - int32x2_t shift; - shift = vshr_n_s32(b, c); - return vadd_s32( a, shift); -} - -int64x1_t vsra_n_s64(int64x1_t a, int64x1_t b, __constrange(1,64) int c); // VSRA.S64 d0,d0,#64 -_NEON2SSE_INLINE int64x1_t vsra_n_s64(int64x1_t a, int64x1_t b, __constrange(1,64) int c) -{ - //may be not optimal compared with a serial solution - int64x1_t shift; - shift = 
vshr_n_s64(b, c); - return vadd_s64( a, shift); -} - -uint8x8_t vsra_n_u8(uint8x8_t a, uint8x8_t b, __constrange(1,8) int c); // VSRA.U8 d0,d0,#8 -_NEON2SSE_INLINE uint8x8_t vsra_n_u8(uint8x8_t a, uint8x8_t b, __constrange(1,8) int c) // VSRA.U8 d0,d0,#8 -{ - uint8x8_t shift; - shift = vshr_n_u8(b, c); - return vadd_u8(a, shift); -} - -uint16x4_t vsra_n_u16(uint16x4_t a, uint16x4_t b, __constrange(1,16) int c); // VSRA.s16 d0,d0,#16 -_NEON2SSE_INLINE uint16x4_t vsra_n_u16(uint16x4_t a, uint16x4_t b, __constrange(1,16) int c) // VSRA.s16 d0,d0,#16 -{ - uint16x4_t shift; - shift = vshr_n_u16(b, c); - return vadd_u16(a,shift); -} - -uint32x2_t vsra_n_u32(uint32x2_t a, uint32x2_t b, __constrange(1,32) int c); // VSRA.U32 d0,d0,#32 -_NEON2SSE_INLINE uint32x2_t vsra_n_u32(uint32x2_t a, uint32x2_t b, __constrange(1,32) int c) // VSRA.U32 d0,d0,#32 -{ - //may be not optimal compared with the serial execution - uint32x2_t shift; - shift = vshr_n_u32(b, c); - return vadd_u32( a, shift); -} - -uint64x1_t vsra_n_u64(uint64x1_t a, uint64x1_t b, __constrange(1,64) int c); // VSRA.U64 d0,d0,#64 -_NEON2SSE_INLINE uint64x1_t vsra_n_u64(uint64x1_t a, uint64x1_t b, __constrange(1,64) int c) // VSRA.U64 d0,d0,#64 -{ - //may be not optimal compared with the serial execution - uint64x1_t shift; - shift = vshr_n_u64(b, c); - return vadd_u64(a, shift); -} - -int8x16_t vsraq_n_s8(int8x16_t a, int8x16_t b, __constrange(1,8) int c); // VSRA.S8 q0,q0,#8 -_NEON2SSE_INLINE int8x16_t vsraq_n_s8(int8x16_t a, int8x16_t b, __constrange(1,8) int c) // VSRA.S8 q0,q0,#8 -{ - int8x16_t shift; - shift = vshrq_n_s8(b, c); - return vaddq_s8(a, shift); -} - -int16x8_t vsraq_n_s16(int16x8_t a, int16x8_t b, __constrange(1,16) int c); // VSRA.S16 q0,q0,#16 -_NEON2SSE_INLINE int16x8_t vsraq_n_s16(int16x8_t a, int16x8_t b, __constrange(1,16) int c) // VSRA.S16 q0,q0,#16 -{ - int16x8_t shift; - shift = vshrq_n_s16(b, c); - return vaddq_s16(a, shift); -} - -int32x4_t vsraq_n_s32(int32x4_t a, int32x4_t b, 
__constrange(1,32) int c); // VSRA.S32 q0,q0,#32 -_NEON2SSE_INLINE int32x4_t vsraq_n_s32(int32x4_t a, int32x4_t b, __constrange(1,32) int c) // VSRA.S32 q0,q0,#32 -{ - int32x4_t shift; - shift = vshrq_n_s32(b, c); - return vaddq_s32(a, shift); -} - -int64x2_t vsraq_n_s64(int64x2_t a, int64x2_t b, __constrange(1,64) int c); // VSRA.S64 q0,q0,#64 -_NEON2SSE_INLINE int64x2_t vsraq_n_s64(int64x2_t a, int64x2_t b, __constrange(1,64) int c) // VSRA.S64 q0,q0,#64 -{ - int64x2_t shift; - shift = vshrq_n_s64(b, c); - return vaddq_s64( a, shift); -} - -uint8x16_t vsraq_n_u8(uint8x16_t a, uint8x16_t b, __constrange(1,8) int c); // VSRA.U8 q0,q0,#8 -_NEON2SSE_INLINE uint8x16_t vsraq_n_u8(uint8x16_t a, uint8x16_t b, __constrange(1,8) int c) // VSRA.U8 q0,q0,#8 -{ - uint8x16_t shift; - shift = vshrq_n_u8(b, c); - return vaddq_u8(a, shift); -} - -uint16x8_t vsraq_n_u16(uint16x8_t a, uint16x8_t b, __constrange(1,16) int c); // VSRA.s16 q0,q0,#16 -_NEON2SSE_INLINE uint16x8_t vsraq_n_u16(uint16x8_t a, uint16x8_t b, __constrange(1,16) int c) // VSRA.s16 q0,q0,#16 -{ - uint16x8_t shift; - shift = vshrq_n_u16(b, c); - return vaddq_u16(a, shift); -} - -uint32x4_t vsraq_n_u32(uint32x4_t a, uint32x4_t b, __constrange(1,32) int c); // VSRA.U32 q0,q0,#32 -_NEON2SSE_INLINE uint32x4_t vsraq_n_u32(uint32x4_t a, uint32x4_t b, __constrange(1,32) int c) // VSRA.U32 q0,q0,#32 -{ - uint32x4_t shift; - shift = vshrq_n_u32(b, c); - return vaddq_u32(a, shift); -} - -uint64x2_t vsraq_n_u64(uint64x2_t a, uint64x2_t b, __constrange(1,64) int c); // VSRA.U64 q0,q0,#64 -_NEON2SSE_INLINE uint64x2_t vsraq_n_u64(uint64x2_t a, uint64x2_t b, __constrange(1,64) int c) // VSRA.U64 q0,q0,#64 -{ - uint64x2_t shift; - shift = vshrq_n_u64(b, c); - return vaddq_u64(a, shift); -} - -//************* Vector rounding shift right by constant and accumulate **************************** -//************************************************************************************************ -int8x8_t vrsra_n_s8(int8x8_t a, 
int8x8_t b, __constrange(1,8) int c); // VRSRA.S8 d0,d0,#8 -_NEON2SSE_INLINE int8x8_t vrsra_n_s8(int8x8_t a, int8x8_t b, __constrange(1,8) int c) // VRSRA.S8 d0,d0,#8 -{ - int8x8_t shift; - shift = vrshr_n_s8(b, c); - return vadd_s8( a, shift); -} - -int16x4_t vrsra_n_s16(int16x4_t a, int16x4_t b, __constrange(1,16) int c); // VRSRA.S16 d0,d0,#16 -_NEON2SSE_INLINE int16x4_t vrsra_n_s16(int16x4_t a, int16x4_t b, __constrange(1,16) int c) // VRSRA.S16 d0,d0,#16 -{ - int16x4_t shift; - shift = vrshr_n_s16( b, c); - return vadd_s16(a, shift); -} - -int32x2_t vrsra_n_s32(int32x2_t a, int32x2_t b, __constrange(1,32) int c); // VRSRA.S32 d0,d0,#32 -_NEON2SSE_INLINE int32x2_t vrsra_n_s32(int32x2_t a, int32x2_t b, __constrange(1,32) int c) // VRSRA.S32 d0,d0,#32 -{ - //may be not optimal compared with the serial execution - int32x2_t shift; - shift = vrshr_n_s32(b, c); - return vadd_s32( a, shift); -} - -int64x1_t vrsra_n_s64(int64x1_t a, int64x1_t b, __constrange(1,64) int c); // VRSRA.S64 d0,d0,#64 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x1_t vrsra_n_s64(int64x1_t a, int64x1_t b, __constrange(1,64) int c), _NEON2SSE_REASON_SLOW_SERIAL) //serial solution -{ - int64x1_t shift; - shift = vrshr_n_s64(b, c); - return vadd_s64( a, shift); -} - -uint8x8_t vrsra_n_u8(uint8x8_t a, uint8x8_t b, __constrange(1,8) int c); // VRSRA.U8 d0,d0,#8 -_NEON2SSE_INLINE uint8x8_t vrsra_n_u8(uint8x8_t a, uint8x8_t b, __constrange(1,8) int c) // VRSRA.U8 d0,d0,#8 -{ - uint8x8_t shift; - shift = vrshr_n_u8(b, c); - return vadd_u8(a, shift); -} - -uint16x4_t vrsra_n_u16(uint16x4_t a, uint16x4_t b, __constrange(1,16) int c); // VRSRA.s16 d0,d0,#16 -_NEON2SSE_INLINE uint16x4_t vrsra_n_u16(uint16x4_t a, uint16x4_t b, __constrange(1,16) int c) // VRSRA.s16 d0,d0,#16 -{ - uint16x4_t shift; - shift = vrshr_n_u16(b, c); - return vadd_u16(a,shift); -} - -uint32x2_t vrsra_n_u32(uint32x2_t a, uint32x2_t b, __constrange(1,32) int c); // VRSRA.U32 d0,d0,#32 -_NEON2SSE_INLINE uint32x2_t 
vrsra_n_u32(uint32x2_t a, uint32x2_t b, __constrange(1,32) int c) // VRSRA.U32 d0,d0,#32 -{ - //may be not optimal compared with the serial execution - uint32x2_t shift; - shift = vrshr_n_u32(b, c); - return vadd_u32( a, shift); -} - -uint64x1_t vrsra_n_u64(uint64x1_t a, uint64x1_t b, __constrange(1,64) int c); // VRSRA.U64 d0,d0,#64 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x1_t vrsra_n_u64(uint64x1_t a, uint64x1_t b, __constrange(1,64) int c), _NEON2SSE_REASON_SLOW_SERIAL) //serial solution -{ - //may be not optimal compared with the serial execution - uint64x1_t shift; - shift = vrshr_n_u64(b, c); - return vadd_u64( a, shift); -} - -int8x16_t vrsraq_n_s8(int8x16_t a, int8x16_t b, __constrange(1,8) int c); // VRSRA.S8 q0,q0,#8 -_NEON2SSE_INLINE int8x16_t vrsraq_n_s8(int8x16_t a, int8x16_t b, __constrange(1,8) int c) // VRSRA.S8 q0,q0,#8 -{ - int8x16_t shift; - shift = vrshrq_n_s8(b, c); - return vaddq_s8(a, shift); -} - -int16x8_t vrsraq_n_s16(int16x8_t a, int16x8_t b, __constrange(1,16) int c); // VRSRA.S16 q0,q0,#16 -_NEON2SSE_INLINE int16x8_t vrsraq_n_s16(int16x8_t a, int16x8_t b, __constrange(1,16) int c) // VRSRA.S16 q0,q0,#16 -{ - int16x8_t shift; - shift = vrshrq_n_s16(b, c); - return vaddq_s16(a, shift); -} - -int32x4_t vrsraq_n_s32(int32x4_t a, int32x4_t b, __constrange(1,32) int c); // VRSRA.S32 q0,q0,#32 -_NEON2SSE_INLINE int32x4_t vrsraq_n_s32(int32x4_t a, int32x4_t b, __constrange(1,32) int c) // VRSRA.S32 q0,q0,#32 -{ - int32x4_t shift; - shift = vrshrq_n_s32(b, c); - return vaddq_s32(a, shift); -} - -int64x2_t vrsraq_n_s64(int64x2_t a, int64x2_t b, __constrange(1,64) int c); // VRSRA.S64 q0,q0,#64 -_NEON2SSE_INLINE int64x2_t vrsraq_n_s64(int64x2_t a, int64x2_t b, __constrange(1,64) int c) -{ - int64x2_t shift; - shift = vrshrq_n_s64(b, c); - return vaddq_s64(a, shift); -} - -uint8x16_t vrsraq_n_u8(uint8x16_t a, uint8x16_t b, __constrange(1,8) int c); // VRSRA.U8 q0,q0,#8 -_NEON2SSE_INLINE uint8x16_t vrsraq_n_u8(uint8x16_t a, uint8x16_t 
b, __constrange(1,8) int c) // VRSRA.U8 q0,q0,#8 -{ - uint8x16_t shift; - shift = vrshrq_n_u8(b, c); - return vaddq_u8(a, shift); -} - -uint16x8_t vrsraq_n_u16(uint16x8_t a, uint16x8_t b, __constrange(1,16) int c); // VRSRA.s16 q0,q0,#16 -_NEON2SSE_INLINE uint16x8_t vrsraq_n_u16(uint16x8_t a, uint16x8_t b, __constrange(1,16) int c) // VRSRA.s16 q0,q0,#16 -{ - uint16x8_t shift; - shift = vrshrq_n_u16(b, c); - return vaddq_u16(a, shift); -} - -uint32x4_t vrsraq_n_u32(uint32x4_t a, uint32x4_t b, __constrange(1,32) int c); // VRSRA.U32 q0,q0,#32 -_NEON2SSE_INLINE uint32x4_t vrsraq_n_u32(uint32x4_t a, uint32x4_t b, __constrange(1,32) int c) // VRSRA.U32 q0,q0,#32 -{ - uint32x4_t shift; - shift = vrshrq_n_u32(b, c); - return vaddq_u32(a, shift); -} - -uint64x2_t vrsraq_n_u64(uint64x2_t a, uint64x2_t b, __constrange(1,64) int c); // VRSRA.U64 q0,q0,#64 -_NEON2SSE_INLINE uint64x2_t vrsraq_n_u64(uint64x2_t a, uint64x2_t b, __constrange(1,64) int c) -{ - uint64x2_t shift; - shift = vrshrq_n_u64(b, c); - return vaddq_u64(a, shift); -} - -//**********************Vector saturating shift left by constant ***************************** -//******************************************************************************************** -//we don't check const ranges assuming they are met -int8x8_t vqshl_n_s8(int8x8_t a, __constrange(0,7) int b); // VQSHL.S8 d0,d0,#0 -_NEON2SSE_INLINE int8x8_t vqshl_n_s8(int8x8_t a, __constrange(0,7) int b) // VQSHL.S8 d0,d0,#0 -{ - //no 8 bit shift available in IA32 SIMD, go to 16 bit. 
It also provides the auto saturation (in packs function) - int8x8_t res64; - __m128i a128, r128; - a128 = _MM_CVTEPI8_EPI16 (_pM128i(a)); //SSE 4.1 - r128 = _mm_slli_epi16 (a128, b); - r128 = _mm_packs_epi16 (r128,r128); //saturated s8, use 64 low bits only - return64(r128); -} - -int16x4_t vqshl_n_s16(int16x4_t a, __constrange(0,15) int b); // VQSHL.S16 d0,d0,#0 -_NEON2SSE_INLINE int16x4_t vqshl_n_s16(int16x4_t a, __constrange(0,15) int b) // VQSHL.S16 d0,d0,#0 -{ - // go to 32 bit to get the auto saturation (in packs function) - int16x4_t res64; - __m128i a128, r128; - a128 = _MM_CVTEPI16_EPI32 (_pM128i(a)); //SSE 4.1 - r128 = _mm_slli_epi32 (a128, b); //shift_res - r128 = _mm_packs_epi32 (r128,r128); //saturated s16, use 64 low bits only - return64(r128); -} - -int32x2_t vqshl_n_s32(int32x2_t a, __constrange(0,31) int b); // VQSHL.S32 d0,d0,#0 -_NEON2SSE_INLINE int32x2_t vqshl_n_s32(int32x2_t a, __constrange(0,31) int b) -{ - //serial execution may be faster - int32x2_t res64; - return64(vqshlq_n_s32 (_pM128i(a), b)); -} - - -int64x1_t vqshl_n_s64(int64x1_t a, __constrange(0,63) int b); // VQSHL.S64 d0,d0,#0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x1_t vqshl_n_s64(int64x1_t a, __constrange(0,63) int b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - // no effective SIMD solution here - int64x1_t res; - int64_t bmask; - int64_t a_i64 = *( int64_t*)&a; - bmask = ( int64_t)1 << (63 - b); //positive - if (a_i64 >= bmask) { - res.m64_i64[0] = ~(_SIGNBIT64); - } else { - res.m64_i64[0] = (a_i64 <= -bmask) ? 
_SIGNBIT64 : a_i64 << b; - } - return res; -} - - -uint8x8_t vqshl_n_u8(uint8x8_t a, __constrange(0,7) int b); // VQSHL.U8 d0,d0,#0 -_NEON2SSE_INLINE uint8x8_t vqshl_n_u8(uint8x8_t a, __constrange(0,7) int b) // VQSHL.U8 d0,d0,#0 -{ - //no 8 bit shift available in IA32 SIMD, go to 16 bit - uint8x8_t res64; - __m128i a128, r128; - a128 = _MM_CVTEPU8_EPI16 (_pM128i(a)); //SSE 4.1 - r128 = _mm_slli_epi16 (a128, b); //shift_res - r128 = _mm_packus_epi16 (r128,r128); //saturated u8, use 64 low bits only - return64(r128); -} - -uint16x4_t vqshl_n_u16(uint16x4_t a, __constrange(0,15) int b); // VQSHL.s16 d0,d0,#0 -_NEON2SSE_INLINE uint16x4_t vqshl_n_u16(uint16x4_t a, __constrange(0,15) int b) // VQSHL.s16 d0,d0,#0 -{ - // go to 32 bit to get the auto saturation (in packus function) - uint16x4_t res64; - __m128i a128, r128; - a128 = _MM_CVTEPU16_EPI32 (_pM128i(a)); //SSE 4.1 - r128 = _mm_slli_epi32 (a128, b); //shift_res - r128 = _MM_PACKUS1_EPI32 (r128); //saturated s16 - return64(r128); -} - -uint32x2_t vqshl_n_u32(uint32x2_t a, __constrange(0,31) int b); // VQSHL.U32 d0,d0,#0 -_NEON2SSE_INLINE uint32x2_t vqshl_n_u32(uint32x2_t a, __constrange(0,31) int b) -{ - uint32x2_t res64; - return64(vqshlq_n_u32(_pM128i(a), b)); -} - -uint64x1_t vqshl_n_u64(uint64x1_t a, __constrange(0,63) int b); // VQSHL.U64 d0,d0,#0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x1_t vqshl_n_u64(uint64x1_t a, __constrange(0,63) int b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - // no effective SIMD solution here - uint64x1_t res; - uint64_t bmask; - uint64_t a_i64 = *(uint64_t*)&a; - bmask = ( uint64_t)1 << (64 - b); - res.m64_u64[0] = (a_i64 >= bmask)&&(b>0) ? 
0xffffffffffffffff : a_i64 << b; //if b=0 we are fine with any a - return res; -} - -int8x16_t vqshlq_n_s8(int8x16_t a, __constrange(0,7) int b); // VQSHL.S8 q0,q0,#0 -_NEON2SSE_INLINE int8x16_t vqshlq_n_s8(int8x16_t a, __constrange(0,7) int b) // VQSHL.S8 q0,q0,#0 -{ - // go to 16 bit to get the auto saturation (in packs function) - __m128i a128, r128_1, r128_2; - a128 = _MM_CVTEPI8_EPI16 (a); //SSE 4.1 - r128_1 = _mm_slli_epi16 (a128, b); - //swap hi and low part of a128 to process the remaining data - a128 = _mm_shuffle_epi32 (a, _SWAP_HI_LOW32); - a128 = _MM_CVTEPI8_EPI16 (a128); - r128_2 = _mm_slli_epi16 (a128, b); - return _mm_packs_epi16 (r128_1, r128_2); //saturated s8 -} - -int16x8_t vqshlq_n_s16(int16x8_t a, __constrange(0,15) int b); // VQSHL.S16 q0,q0,#0 -_NEON2SSE_INLINE int16x8_t vqshlq_n_s16(int16x8_t a, __constrange(0,15) int b) // VQSHL.S16 q0,q0,#0 -{ - // manual saturation solution looks LESS optimal than 32 bits conversion one - // go to 32 bit to get the auto saturation (in packs function) - __m128i a128, r128_1, r128_2; - a128 = _MM_CVTEPI16_EPI32 (a); //SSE 4.1 - r128_1 = _mm_slli_epi32 (a128, b); //shift_res - //swap hi and low part of a128 to process the remaining data - a128 = _mm_shuffle_epi32 (a, _SWAP_HI_LOW32); - a128 = _MM_CVTEPI16_EPI32 (a128); - r128_2 = _mm_slli_epi32 (a128, b); - return _mm_packs_epi32 (r128_1, r128_2); //saturated s16 -} - -int32x4_t vqshlq_n_s32(int32x4_t a, __constrange(0,31) int b); // VQSHL.S32 q0,q0,#0 -_NEON2SSE_INLINE int32x4_t vqshlq_n_s32(int32x4_t a, __constrange(0,31) int b) // VQSHL.S32 q0,q0,#0 -{ - // no 64 bit saturation option available, special tricks necessary - __m128i c1, maskA, saturation_mask, c7ffffff_mask, shift_res, shift_res_mask; - c1 = _mm_cmpeq_epi32(a,a); //0xff..ff - maskA = _mm_srli_epi32(c1, b + 1); //mask for positive numbers (32-b+1) zeros and b-1 ones - saturation_mask = _mm_cmpgt_epi32 (a, maskA); //0xff...ff if we need saturation, 0 otherwise - c7ffffff_mask = 
_mm_srli_epi32(saturation_mask, 1); //saturated to 0x7f..ff when needed and zeros if not - shift_res = _mm_slli_epi32 (a, b); - shift_res_mask = _mm_andnot_si128(saturation_mask, shift_res); - //result with positive numbers saturated - shift_res = _mm_or_si128 (c7ffffff_mask, shift_res_mask); - //treat negative numbers - maskA = _mm_slli_epi32(c1, 31 - b); //mask for negative numbers b-1 ones and (32-b+1) zeros - saturation_mask = _mm_cmpgt_epi32 (maskA,a); //0xff...ff if we need saturation, 0 otherwise - c7ffffff_mask = _mm_slli_epi32(saturation_mask, 31); //saturated to 0x80..00 when needed and zeros if not - shift_res_mask = _mm_andnot_si128(saturation_mask, shift_res); - return _mm_or_si128 (c7ffffff_mask, shift_res_mask); -} - -int64x2_t vqshlq_n_s64(int64x2_t a, __constrange(0,63) int b); // VQSHL.S64 q0,q0,#0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x2_t vqshlq_n_s64(int64x2_t a, __constrange(0,63) int b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - // no effective SIMD solution here - _NEON2SSE_ALIGN_16 int64_t atmp[2], res[2]; - int64_t bmask; - int i; - bmask = ( int64_t)1 << (63 - b); //positive - _mm_store_si128((__m128i*)atmp, a); - for (i = 0; i<2; i++) { - if (atmp[i] >= bmask) { - res[i] = ~(_SIGNBIT64); - } else { - res[i] = (atmp[i] <= -bmask) ? 
_SIGNBIT64 : atmp[i] << b; - } - } - return _mm_load_si128((__m128i*)res); -} - -uint8x16_t vqshlq_n_u8(uint8x16_t a, __constrange(0,7) int b); // VQSHL.U8 q0,q0,#0 -_NEON2SSE_INLINE uint8x16_t vqshlq_n_u8(uint8x16_t a, __constrange(0,7) int b) // VQSHL.U8 q0,q0,#0 -{ - // go to 16 bit to get the auto saturation (in packs function) - __m128i a128, r128_1, r128_2; - a128 = _MM_CVTEPU8_EPI16 (a); //SSE 4.1 - r128_1 = _mm_slli_epi16 (a128, b); - //swap hi and low part of a128 to process the remaining data - a128 = _mm_shuffle_epi32 (a, _SWAP_HI_LOW32); - a128 = _MM_CVTEPU8_EPI16 (a128); - r128_2 = _mm_slli_epi16 (a128, b); - return _mm_packus_epi16 (r128_1, r128_2); //saturated u8 -} - -uint16x8_t vqshlq_n_u16(uint16x8_t a, __constrange(0,15) int b); // VQSHL.s16 q0,q0,#0 -_NEON2SSE_INLINE uint16x8_t vqshlq_n_u16(uint16x8_t a, __constrange(0,15) int b) // VQSHL.s16 q0,q0,#0 -{ - // manual saturation solution looks more optimal than 32 bits conversion one - __m128i cb, c8000, a_signed, saturation_mask, shift_res; - cb = _mm_set1_epi16((1 << (16 - b)) - 1 - 0x8000 ); - c8000 = _mm_set1_epi16 (0x8000); -//no unsigned shorts comparison in SSE, only signed available, so need the trick - a_signed = _mm_sub_epi16(a, c8000); //go to signed - saturation_mask = _mm_cmpgt_epi16 (a_signed, cb); - shift_res = _mm_slli_epi16 (a, b); - return _mm_or_si128 (shift_res, saturation_mask); -} - -uint32x4_t vqshlq_n_u32(uint32x4_t a, __constrange(0,31) int b); // VQSHL.U32 q0,q0,#0 -_NEON2SSE_INLINE uint32x4_t vqshlq_n_u32(uint32x4_t a, __constrange(0,31) int b) // VQSHL.U32 q0,q0,#0 -{ - // manual saturation solution, no 64 bit saturation option, the serial version may be faster - __m128i cb, c80000000, a_signed, saturation_mask, shift_res; - cb = _mm_set1_epi32((1 << (32 - b)) - 1 - 0x80000000 ); - c80000000 = _mm_set1_epi32 (0x80000000); -//no unsigned ints comparison in SSE, only signed available, so need the trick - a_signed = _mm_sub_epi32(a, c80000000); //go to signed - 
saturation_mask = _mm_cmpgt_epi32 (a_signed, cb); - shift_res = _mm_slli_epi32 (a, b); - return _mm_or_si128 (shift_res, saturation_mask); -} - -uint64x2_t vqshlq_n_u64(uint64x2_t a, __constrange(0,63) int b); // VQSHL.U64 q0,q0,#0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x2_t vqshlq_n_u64(uint64x2_t a, __constrange(0,63) int b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - // no effective SIMD solution here - _NEON2SSE_ALIGN_16 uint64_t atmp[2], res[2]; - uint64_t bmask; - int i; - bmask = ( uint64_t)1 << (64 - b); - _mm_store_si128((__m128i*)atmp, a); - for (i = 0; i<2; i++) { - res[i] = (atmp[i] >= bmask)&&(b>0) ? 0xffffffffffffffff : atmp[i] << b; //if b=0 we are fine with any a - } - return _mm_load_si128((__m128i*)res); -} - -//**************Vector signed->unsigned saturating shift left by constant ************* -//************************************************************************************* -uint8x8_t vqshlu_n_s8(int8x8_t a, __constrange(0,7) int b); // VQSHLU.S8 d0,d0,#0 -_NEON2SSE_INLINE uint8x8_t vqshlu_n_s8(int8x8_t a, __constrange(0,7) int b) // VQSHLU.S8 d0,d0,#0 -{ - //no 8 bit shift available in IA32 SIMD, go to 16 bit. 
It also provides the auto saturation (in packs function) - uint8x8_t res64; - __m128i a128, r128; - a128 = _MM_CVTEPI8_EPI16 (_pM128i(a)); //SSE 4.1 - r128 = _mm_slli_epi16 (a128, b); - r128 = _mm_packus_epi16 (r128,r128); //saturated u8, use 64 low bits only - return64(r128); -} - -uint16x4_t vqshlu_n_s16(int16x4_t a, __constrange(0,15) int b); // VQSHLU.S16 d0,d0,#0 -_NEON2SSE_INLINE uint16x4_t vqshlu_n_s16(int16x4_t a, __constrange(0,15) int b) // VQSHLU.S16 d0,d0,#0 -{ - uint16x4_t res64; - __m128i a128, r128; - a128 = _MM_CVTEPI16_EPI32 (_pM128i(a)); //SSE 4.1 - r128 = _mm_slli_epi32 (a128, b); //shift_res - r128 = _MM_PACKUS1_EPI32 (r128); //saturated u16, use 64 low bits only - return64(r128); -} - -uint32x2_t vqshlu_n_s32(int32x2_t a, __constrange(0,31) int b); // VQSHLU.S32 d0,d0,#0 -_NEON2SSE_INLINE uint32x2_t vqshlu_n_s32(int32x2_t a, __constrange(0,31) int b) -{ - uint32x2_t res64; - return64( vqshluq_n_s32(_pM128i(a), b)); -} - -uint64x1_t vqshlu_n_s64(int64x1_t a, __constrange(0,63) int b); // VQSHLU.S64 d0,d0,#0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x1_t vqshlu_n_s64(int64x1_t a, __constrange(0,63) int b), _NEON2SSE_REASON_SLOW_SERIAL) // no effective SIMD solution here, serial execution looks faster -{ - uint64x1_t res; - uint64_t limit; - if (a.m64_i64[0]<=0) { - res.m64_u64[0] = 0; - } else { - limit = (uint64_t) 1 << (64 - b); - res.m64_u64[0] = ( ((uint64_t)a.m64_i64[0]) >= limit) ? 
~((uint64_t)0) : a.m64_i64[0] << b; - } - return res; -} - -uint8x16_t vqshluq_n_s8(int8x16_t a, __constrange(0,7) int b); // VQSHLU.S8 q0,q0,#0 -_NEON2SSE_INLINE uint8x16_t vqshluq_n_s8(int8x16_t a, __constrange(0,7) int b) // VQSHLU.S8 q0,q0,#0 -{ - __m128i a128, r128_1, r128_2; - a128 = _MM_CVTEPI8_EPI16 (a); //SSE 4.1 - r128_1 = _mm_slli_epi16 (a128, b); - //swap hi and low part of a128 to process the remaining data - a128 = _mm_shuffle_epi32 (a, _SWAP_HI_LOW32); - a128 = _MM_CVTEPI8_EPI16 (a128); - r128_2 = _mm_slli_epi16 (a128, b); - return _mm_packus_epi16 (r128_1, r128_2); //saturated u8 -} - -uint16x8_t vqshluq_n_s16(int16x8_t a, __constrange(0,15) int b); // VQSHLU.S16 q0,q0,#0 -_NEON2SSE_INLINE uint16x8_t vqshluq_n_s16(int16x8_t a, __constrange(0,15) int b) // VQSHLU.S16 q0,q0,#0 -{ - //the manual saturation solution looks less optimal than the 32-bit conversion one - __m128i a128, r128_1, r128_2; - a128 = _MM_CVTEPI16_EPI32 (a); //SSE 4.1 - r128_1 = _mm_slli_epi32 (a128, b); //shift_res - //swap hi and low part of a128 to process the remaining data - a128 = _mm_shuffle_epi32 (a, _SWAP_HI_LOW32); - a128 = _MM_CVTEPI16_EPI32 (a128); - r128_2 = _mm_slli_epi32 (a128, b); - return _MM_PACKUS_EPI32 (r128_1, r128_2); //saturated u16 -} - -uint32x4_t vqshluq_n_s32(int32x4_t a, __constrange(0,31) int b); // VQSHLU.S32 q0,q0,#0 -_NEON2SSE_INLINE uint32x4_t vqshluq_n_s32(int32x4_t a, __constrange(0,31) int b) // VQSHLU.S32 q0,q0,#0 -{ - //the solution may not be optimal compared with the serial one - __m128i zero, maskA, maskGT0, a0, a_masked, a_shift; - zero = _mm_setzero_si128(); - maskA = _mm_cmpeq_epi32(a, a); - maskA = _mm_slli_epi32(maskA,(32 - b)); // b ones and (32-b) zeros - //saturate negative numbers to zero - maskGT0 = _mm_cmpgt_epi32 (a, zero); //0xffffffff if positive number and zero otherwise (negative numbers) - a0 = _mm_and_si128 (a, maskGT0); //negative are zeros now - //saturate positive to 0xffffffff - a_masked = _mm_and_si128 (a0, 
maskA); - a_masked = _mm_cmpgt_epi32 (a_masked, zero); //0xffffffff if saturation is necessary, 0 otherwise - a_shift = _mm_slli_epi32 (a0, b); - return _mm_or_si128 (a_shift, a_masked); //actual saturation -} - -uint64x2_t vqshluq_n_s64(int64x2_t a, __constrange(0,63) int b); // VQSHLU.S64 q0,q0,#0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint64x2_t vqshluq_n_s64(int64x2_t a, __constrange(0,63) int b), _NEON2SSE_REASON_SLOW_SERIAL) -{ - // no effective SIMD solution here, serial execution looks faster - _NEON2SSE_ALIGN_16 int64_t atmp[2]; - _NEON2SSE_ALIGN_16 uint64_t res[2]; - uint64_t limit; - int i; - _mm_store_si128((__m128i*)atmp, a); - for (i = 0; i<2; i++) { - if (atmp[i]<=0) { - res[i] = 0; - } else { - limit = (uint64_t) 1 << (64 - b); - res[i] = ( ((uint64_t)atmp[i]) >= limit) ? ~((uint64_t)0) : atmp[i] << b; - } - } - return _mm_load_si128((__m128i*)res); -} - -//************** Vector narrowing shift right by constant ************** -//********************************************************************** -int8x8_t vshrn_n_s16(int16x8_t a, __constrange(1,8) int b); // VSHRN.I16 d0,q0,#8 -_NEON2SSE_INLINE int8x8_t vshrn_n_s16(int16x8_t a, __constrange(1,8) int b) // VSHRN.I16 d0,q0,#8 -{ - int8x8_t res64; - __m128i r16; - _NEON2SSE_ALIGN_16 int8_t mask8_16_even_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15 }; - r16 = vshrq_n_s16(a,b); - r16 = _mm_shuffle_epi8 (r16, *(__m128i*) mask8_16_even_odd); //narrow, use low 64 bits only. 
Impossible to use _mm_packs because of negative saturation problems - return64(r16); -} - -int16x4_t vshrn_n_s32(int32x4_t a, __constrange(1,16) int b); // VSHRN.I32 d0,q0,#16 -_NEON2SSE_INLINE int16x4_t vshrn_n_s32(int32x4_t a, __constrange(1,16) int b) // VSHRN.I32 d0,q0,#16 -{ - int16x4_t res64; - __m128i r32; - _NEON2SSE_ALIGN_16 int8_t mask16_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15 }; - r32 = vshrq_n_s32(a,b); - r32 = _mm_shuffle_epi8 (r32, *(__m128i*) mask16_odd); //narrow, use low 64 bits only. Impossible to use _mm_packs because of negative saturation problems - return64(r32); -} - -int32x2_t vshrn_n_s64(int64x2_t a, __constrange(1,32) int b); // VSHRN.I64 d0,q0,#32 -_NEON2SSE_INLINE int32x2_t vshrn_n_s64(int64x2_t a, __constrange(1,32) int b) -{ - int32x2_t res64; - __m128i r64; - r64 = vshrq_n_s64(a,b); - r64 = _mm_shuffle_epi32(r64, 0 | (2 << 2) | (1 << 4) | (3 << 6)); //shuffle the data to get 2 32-bits - return64(r64); -} - -uint8x8_t vshrn_n_u16(uint16x8_t a, __constrange(1,8) int b); // VSHRN.I16 d0,q0,#8 -_NEON2SSE_INLINE uint8x8_t vshrn_n_u16(uint16x8_t a, __constrange(1,8) int b) // VSHRN.I16 d0,q0,#8 -{ - uint8x8_t res64; - __m128i mask, r16; - mask = _mm_set1_epi16(0xff); - r16 = vshrq_n_s16(a,b); //after right shift b>=1 unsigned var fits into signed range, so we could use _mm_packus_epi16 (signed 16 to unsigned 8) - r16 = _mm_and_si128(r16, mask); //to avoid saturation - r16 = _mm_packus_epi16 (r16,r16); //narrow, use low 64 bits only - return64(r16); -} - -uint16x4_t vshrn_n_u32(uint32x4_t a, __constrange(1,16) int b); // VSHRN.I32 d0,q0,#16 -_NEON2SSE_INLINE uint16x4_t vshrn_n_u32(uint32x4_t a, __constrange(1,16) int b) // VSHRN.I32 d0,q0,#16 -{ - uint16x4_t res64; - __m128i mask, r32; - mask = _mm_set1_epi32(0xffff); - r32 = vshrq_n_u32(a,b); //after right shift b>=1 unsigned var fits into signed range, so we could use _MM_PACKUS_EPI32 (signed 32 to unsigned 16) - r32 = _mm_and_si128(r32, mask); //to avoid 
saturation - r32 = _MM_PACKUS1_EPI32 (r32); //saturate and narrow, use low 64 bits only - return64(r32); -} - -uint32x2_t vshrn_n_u64(uint64x2_t a, __constrange(1,32) int b); // VSHRN.I64 d0,q0,#32 -_NEON2SSE_INLINE uint32x2_t vshrn_n_u64(uint64x2_t a, __constrange(1,32) int b) -{ - uint32x2_t res64; - __m128i r64; - r64 = vshrq_n_u64(a,b); - r64 = _mm_shuffle_epi32(r64, 0 | (2 << 2) | (1 << 4) | (3 << 6)); //shuffle the data to get 2 32-bits - return64(r64); -} - -//************** Vector signed->unsigned narrowing saturating shift right by constant ******** -//********************************************************************************************* -uint8x8_t vqshrun_n_s16(int16x8_t a, __constrange(1,8) int b); // VQSHRUN.S16 d0,q0,#8 -_NEON2SSE_INLINE uint8x8_t vqshrun_n_s16(int16x8_t a, __constrange(1,8) int b) // VQSHRUN.S16 d0,q0,#8 -{ - uint8x8_t res64; - __m128i r16; - r16 = vshrq_n_s16(a,b); - r16 = _mm_packus_epi16 (r16,r16); //saturate and narrow (signed to unsigned), use low 64 bits only - return64(r16); -} - -uint16x4_t vqshrun_n_s32(int32x4_t a, __constrange(1,16) int b); // VQSHRUN.S32 d0,q0,#16 -_NEON2SSE_INLINE uint16x4_t vqshrun_n_s32(int32x4_t a, __constrange(1,16) int b) // VQSHRUN.S32 d0,q0,#16 -{ - uint16x4_t res64; - __m128i r32; - r32 = vshrq_n_s32(a,b); - r32 = _MM_PACKUS1_EPI32 (r32); //saturate and narrow(signed to unsigned), use low 64 bits only - return64(r32); -} - -uint32x2_t vqshrun_n_s64(int64x2_t a, __constrange(1,32) int b); // VQSHRUN.S64 d0,q0,#32 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x2_t vqshrun_n_s64(int64x2_t a, __constrange(1,32) int b), _NEON2SSE_REASON_SLOW_SERIAL) //serial solution is faster -{ - _NEON2SSE_ALIGN_16 int64_t atmp[2]; - uint32x2_t res; - int64_t res64; - _mm_store_si128((__m128i*)atmp, a); - if (atmp[0] < 0) { - res.m64_u32[0] = 0; - } else { - res64 = (atmp[0] >> b); - res.m64_u32[0] = (res64 > (int64_t)0xffffffff) ? 
0xffffffff : (uint32_t) res64; - } - if (atmp[1] < 0) { - res.m64_u32[1] = 0; - } else { - res64 = (atmp[1] >> b); - res.m64_u32[1] = (res64 > (int64_t)0xffffffff) ? 0xffffffff : (uint32_t)res64; - } - return res; -} - -//**** Vector signed->unsigned rounding narrowing saturating shift right by constant ***** -uint8x8_t vqrshrun_n_s16(int16x8_t a, __constrange(1,8) int b); // VQRSHRUN.S16 d0,q0,#8 -_NEON2SSE_INLINE uint8x8_t vqrshrun_n_s16(int16x8_t a, __constrange(1,8) int b) // VQRSHRUN.S16 d0,q0,#8 -{ - //the solution may not be optimal compared with the serial one - __m128i r16; - uint8x8_t res64; - r16 = vrshrq_n_s16(a,b); - r16 = _mm_packus_epi16 (r16,r16); //saturate and narrow (signed to unsigned), use low 64 bits only - return64(r16); -} - -uint16x4_t vqrshrun_n_s32(int32x4_t a, __constrange(1,16) int b); // VQRSHRUN.S32 d0,q0,#16 -_NEON2SSE_INLINE uint16x4_t vqrshrun_n_s32(int32x4_t a, __constrange(1,16) int b) // VQRSHRUN.S32 d0,q0,#16 -{ - //the solution may not be optimal compared with the serial one - __m128i r32; - uint16x4_t res64; - r32 = vrshrq_n_s32(a,b); - r32 = _MM_PACKUS1_EPI32 (r32); //saturate and narrow (signed to unsigned), use low 64 bits only - return64(r32); -} - -uint32x2_t vqrshrun_n_s64(int64x2_t a, __constrange(1,32) int b); // VQRSHRUN.S64 d0,q0,#32 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x2_t vqrshrun_n_s64(int64x2_t a, __constrange(1,32) int b), _NEON2SSE_REASON_SLOW_SERIAL) //serial solution is faster -{ - _NEON2SSE_ALIGN_16 int64_t atmp[2]; - uint32x2_t res; - int64_t res64; - _mm_store_si128((__m128i*)atmp, a); - if (atmp[0] < 0) { - res.m64_u32[0] = 0; - } else { - res64 = (atmp[0] >> b) + ( (atmp[0] & ((int64_t)1 << (b - 1))) >> (b - 1) ); - res.m64_u32[0] = (res64 > (int64_t)0xffffffff ) ? 0xffffffff : res64; - } - if (atmp[1] < 0) { - res.m64_u32[1] = 0; - } else { - res64 = (atmp[1] >> b) + ( (atmp[1] & ((int64_t)1 << (b - 1))) >> (b - 1) ); - res.m64_u32[1] = (res64 > (int64_t)0xffffffff ) ? 
0xffffffff : res64; - } - return res; -} - -//***** Vector narrowing saturating shift right by constant ****** -//***************************************************************** -int8x8_t vqshrn_n_s16(int16x8_t a, __constrange(1,8) int b); // VQSHRN.S16 d0,q0,#8 -_NEON2SSE_INLINE int8x8_t vqshrn_n_s16(int16x8_t a, __constrange(1,8) int b) // VQSHRN.S16 d0,q0,#8 -{ - int8x8_t res64; - __m128i r16; - r16 = vshrq_n_s16(a,b); - r16 = _mm_packs_epi16 (r16,r16); //saturate and narrow, use low 64 bits only - return64(r16); -} - -int16x4_t vqshrn_n_s32(int32x4_t a, __constrange(1,16) int b); // VQSHRN.S32 d0,q0,#16 -_NEON2SSE_INLINE int16x4_t vqshrn_n_s32(int32x4_t a, __constrange(1,16) int b) // VQSHRN.S32 d0,q0,#16 -{ - int16x4_t res64; - __m128i r32; - r32 = vshrq_n_s32(a,b); - r32 = _mm_packs_epi32 (r32,r32); //saturate and narrow, use low 64 bits only - return64(r32); -} - -int32x2_t vqshrn_n_s64(int64x2_t a, __constrange(1,32) int b); // VQSHRN.S64 d0,q0,#32 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x2_t vqshrn_n_s64(int64x2_t a, __constrange(1,32) int b), _NEON2SSE_REASON_SLOW_UNEFFECTIVE) -{ - //no optimal SIMD solution found - _NEON2SSE_ALIGN_16 int64_t res64[2], atmp[2]; - int32x2_t res; - _mm_store_si128((__m128i*)atmp, a); - res64[0] = (atmp[0] >> b); - res64[1] = (atmp[1] >> b); - if(res64[0]>SINT_MAX) res64[0] = SINT_MAX; - if(res64[0]<SINT_MIN) res64[0] = SINT_MIN; - if(res64[1]>SINT_MAX) res64[1] = SINT_MAX; - if(res64[1]<SINT_MIN) res64[1] = SINT_MIN; - res.m64_i32[0] = (int32_t)res64[0]; - res.m64_i32[1] = (int32_t)res64[1]; - return res; -} - -uint8x8_t vqshrn_n_u16(uint16x8_t a, __constrange(1,8) int b); // VQSHRN.s16 d0,q0,#8 -_NEON2SSE_INLINE uint8x8_t vqshrn_n_u16(uint16x8_t a, __constrange(1,8) int b) // VQSHRN.s16 d0,q0,#8 -{ - uint8x8_t res64; - __m128i r16; - r16 = vshrq_n_u16(a,b); //after right shift b>=1 unsigned var fits into signed range, so we could use _mm_packus_epi16 (signed 16 to unsigned 8) - r16 = _mm_packus_epi16 
(r16,r16); //saturate and narrow, use low 64 bits only - return64(r16); -} - -uint16x4_t vqshrn_n_u32(uint32x4_t a, __constrange(1,16) int b); // VQSHRN.U32 d0,q0,#16 -_NEON2SSE_INLINE uint16x4_t vqshrn_n_u32(uint32x4_t a, __constrange(1,16) int b) // VQSHRN.U32 d0,q0,#16 -{ - uint16x4_t res64; - __m128i r32; - r32 = vshrq_n_u32(a,b); //after right shift b>=1 unsigned var fits into signed range, so we could use _MM_PACKUS_EPI32 (signed 32 to unsigned 8) - r32 = _MM_PACKUS1_EPI32 (r32); //saturate and narrow, use low 64 bits only - return64(r32); -} - -uint32x2_t vqshrn_n_u64(uint64x2_t a, __constrange(1,32) int b); // VQSHRN.U64 d0,q0,#32 -_NEON2SSE_INLINE uint32x2_t vqshrn_n_u64(uint64x2_t a, __constrange(1,32) int b) -{ - //serial solution may be faster - uint32x2_t res64; - __m128i r64, res_hi, zero; - zero = _mm_setzero_si128(); - r64 = vshrq_n_u64(a,b); - res_hi = _mm_srli_epi64(r64, 32); - res_hi = _mm_cmpgt_epi32(res_hi, zero); - r64 = _mm_or_si128(r64, res_hi); - r64 = _mm_shuffle_epi32(r64, 0 | (2 << 2) | (1 << 4) | (3 << 6)); //shuffle the data to get 2 32-bits - return64(r64); -} - - -//********* Vector rounding narrowing shift right by constant ************************* -//**************************************************************************************** -int8x8_t vrshrn_n_s16(int16x8_t a, __constrange(1,8) int b); // VRSHRN.I16 d0,q0,#8 -_NEON2SSE_INLINE int8x8_t vrshrn_n_s16(int16x8_t a, __constrange(1,8) int b) // VRSHRN.I16 d0,q0,#8 -{ - int8x8_t res64; - __m128i r16; - _NEON2SSE_ALIGN_16 int8_t mask8_16_even_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15 }; - r16 = vrshrq_n_s16(a,b); - r16 = _mm_shuffle_epi8 (r16, *(__m128i*) mask8_16_even_odd); //narrow, use low 64 bits only. 
Impossible to use _mm_packs because of negative saturation problems - return64(r16); -} - -int16x4_t vrshrn_n_s32(int32x4_t a, __constrange(1,16) int b); // VRSHRN.I32 d0,q0,#16 -_NEON2SSE_INLINE int16x4_t vrshrn_n_s32(int32x4_t a, __constrange(1,16) int b) // VRSHRN.I32 d0,q0,#16 -{ - int16x4_t res64; - __m128i r32; - _NEON2SSE_ALIGN_16 int8_t mask16_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15 }; - r32 = vrshrq_n_s32(a,b); - r32 = _mm_shuffle_epi8 (r32, *(__m128i*) mask16_odd); //narrow, use low 64 bits only. Impossible to use _mm_packs because of negative saturation problems - return64(r32); -} - -int32x2_t vrshrn_n_s64(int64x2_t a, __constrange(1,32) int b); // VRSHRN.I64 d0,q0,#32 -_NEON2SSE_INLINE int32x2_t vrshrn_n_s64(int64x2_t a, __constrange(1,32) int b) -{ - int32x2_t res64; - __m128i r64; - r64 = vrshrq_n_s64(a,b); - r64 = _mm_shuffle_epi32(r64, 0 | (2 << 2) | (1 << 4) | (3 << 6)); //shuffle the data to get 2 32-bits - return64(r64); -} - -uint8x8_t vrshrn_n_u16(uint16x8_t a, __constrange(1,8) int b); // VRSHRN.I16 d0,q0,#8 -_NEON2SSE_INLINE uint8x8_t vrshrn_n_u16(uint16x8_t a, __constrange(1,8) int b) // VRSHRN.I16 d0,q0,#8 -{ - uint8x8_t res64; - __m128i mask, r16; - mask = _mm_set1_epi16(0xff); - r16 = vrshrq_n_s16(a,b); //after right shift b>=1 unsigned var fits into signed range, so we could use _mm_packus_epi16 (signed 16 to unsigned 8) - r16 = _mm_and_si128(r16, mask); //to avoid saturation - r16 = _mm_packus_epi16 (r16,r16); //saturate and narrow, use low 64 bits only - return64(r16); -} - -uint16x4_t vrshrn_n_u32(uint32x4_t a, __constrange(1,16) int b); // VRSHRN.I32 d0,q0,#16 -_NEON2SSE_INLINE uint16x4_t vrshrn_n_u32(uint32x4_t a, __constrange(1,16) int b) // VRSHRN.I32 d0,q0,#16 -{ - uint16x4_t res64; - __m128i mask, r32; - mask = _mm_set1_epi32(0xffff); - r32 = vrshrq_n_u32(a,b); //after right shift b>=1 unsigned var fits into signed range, so we could use _MM_PACKUS_EPI32 (signed 32 to unsigned 8) - r32 = 
_mm_and_si128(r32, mask); //to avoid saturation - r32 = _MM_PACKUS1_EPI32 (r32); //saturate and narrow, use low 64 bits only - return64(r32); -} - -uint32x2_t vrshrn_n_u64(uint64x2_t a, __constrange(1,32) int b); // VRSHRN.I64 d0,q0,#32 -_NEON2SSE_INLINE uint32x2_t vrshrn_n_u64(uint64x2_t a, __constrange(1,32) int b) //serial solution may be faster -{ - uint32x2_t res64; - __m128i r64; - r64 = vrshrq_n_u64(a,b); - r64 = _mm_shuffle_epi32(r64, 0 | (2 << 2) | (1 << 4) | (3 << 6)); //shuffle the data to get 2 32-bits - return64(r64); -} - -//************* Vector rounding narrowing saturating shift right by constant ************ -//**************************************************************************************** -int8x8_t vqrshrn_n_s16(int16x8_t a, __constrange(1,8) int b); // VQRSHRN.S16 d0,q0,#8 -_NEON2SSE_INLINE int8x8_t vqrshrn_n_s16(int16x8_t a, __constrange(1,8) int b) // VQRSHRN.S16 d0,q0,#8 -{ - int8x8_t res64; - __m128i r16; - r16 = vrshrq_n_s16(a,b); - r16 = _mm_packs_epi16 (r16,r16); //saturate and narrow, use low 64 bits only - return64(r16); -} - -int16x4_t vqrshrn_n_s32(int32x4_t a, __constrange(1,16) int b); // VQRSHRN.S32 d0,q0,#16 -_NEON2SSE_INLINE int16x4_t vqrshrn_n_s32(int32x4_t a, __constrange(1,16) int b) // VQRSHRN.S32 d0,q0,#16 -{ - int16x4_t res64; - __m128i r32; - r32 = vrshrq_n_s32(a,b); - r32 = _mm_packs_epi32 (r32,r32); //saturate and narrow, use low 64 bits only - return64(r32); -} - -int32x2_t vqrshrn_n_s64(int64x2_t a, __constrange(1,32) int b); // VQRSHRN.S64 d0,q0,#32 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x2_t vqrshrn_n_s64(int64x2_t a, __constrange(1,32) int b), _NEON2SSE_REASON_SLOW_UNEFFECTIVE) -{ - //no optimal SIMD solution found - _NEON2SSE_ALIGN_16 int64_t res64[2], atmp[2], maskb[2]; - int32x2_t res; - _mm_store_si128((__m128i*)atmp, a); - maskb[0] = atmp[0] & (( int64_t)1 << (b - 1)); - res64[0] = (atmp[0] >> b) + (maskb[0] >> (b - 1)); //rounded result - maskb[1] = atmp[1] & (( int64_t)1 << (b - 1)); - 
res64[1] = (atmp[1] >> b) + (maskb[1] >> (b - 1)); //rounded result - if(res64[0]>SINT_MAX) res64[0] = SINT_MAX; - if(res64[0]<SINT_MIN) res64[0] = SINT_MIN; - if(res64[1]>SINT_MAX) res64[1] = SINT_MAX; - if(res64[1]<SINT_MIN) res64[1] = SINT_MIN; - res.m64_i32[0] = (int32_t)res64[0]; - res.m64_i32[1] = (int32_t)res64[1]; - return res; -} - -uint8x8_t vqrshrn_n_u16(uint16x8_t a, __constrange(1,8) int b); // VQRSHRN.s16 d0,q0,#8 -_NEON2SSE_INLINE uint8x8_t vqrshrn_n_u16(uint16x8_t a, __constrange(1,8) int b) // VQRSHRN.s16 d0,q0,#8 -{ - uint8x8_t res64; - __m128i r16; - r16 = vrshrq_n_u16(a,b); //after right shift b>=1 unsigned var fits into signed range, so we could use _mm_packus_epi16 (signed 16 to unsigned 8) - r16 = _mm_packus_epi16 (r16,r16); //saturate and narrow, use low 64 bits only - return64(r16); -} - -uint16x4_t vqrshrn_n_u32(uint32x4_t a, __constrange(1,16) int b); // VQRSHRN.U32 d0,q0,#16 -_NEON2SSE_INLINE uint16x4_t vqrshrn_n_u32(uint32x4_t a, __constrange(1,16) int b) // VQRSHRN.U32 d0,q0,#16 -{ - uint16x4_t res64; - __m128i r32; - r32 = vrshrq_n_u32(a,b); //after right shift b>=1 unsigned var fits into signed range, so we could use _MM_PACKUS_EPI32 (signed 32 to unsigned 16) - r32 = _MM_PACKUS1_EPI32 (r32); //saturate and narrow, use low 64 bits only - return64(r32); -} - -uint32x2_t vqrshrn_n_u64(uint64x2_t a, __constrange(1,32) int b); // VQRSHRN.U64 d0,q0,#32 -_NEON2SSE_INLINE uint32x2_t vqrshrn_n_u64(uint64x2_t a, __constrange(1,32) int b) -{ - //serial solution may be faster - uint32x2_t res64; - __m128i r64, res_hi, zero; - zero = _mm_setzero_si128(); - r64 = vrshrq_n_u64(a,b); - res_hi = _mm_srli_epi64(r64, 32); - res_hi = _mm_cmpgt_epi32(res_hi, zero); - r64 = _mm_or_si128(r64, res_hi); - r64 = _mm_shuffle_epi32(r64, 0 | (2 << 2) | (1 << 4) | (3 << 6)); //shuffle the data to get 2 32-bits - return64(r64); -} - -//************** Vector widening shift left by constant **************** 
-//************************************************************************ -int16x8_t vshll_n_s8(int8x8_t a, __constrange(0,8) int b); // VSHLL.S8 q0,d0,#0 -_NEON2SSE_INLINE int16x8_t vshll_n_s8(int8x8_t a, __constrange(0,8) int b) // VSHLL.S8 q0,d0,#0 -{ - __m128i r; - r = _MM_CVTEPI8_EPI16 (_pM128i(a)); //SSE 4.1 - return _mm_slli_epi16 (r, b); -} - -int32x4_t vshll_n_s16(int16x4_t a, __constrange(0,16) int b); // VSHLL.S16 q0,d0,#0 -_NEON2SSE_INLINE int32x4_t vshll_n_s16(int16x4_t a, __constrange(0,16) int b) // VSHLL.S16 q0,d0,#0 -{ - __m128i r; - r = _MM_CVTEPI16_EPI32(_pM128i(a)); //SSE4.1, - return _mm_slli_epi32 (r, b); -} - -int64x2_t vshll_n_s32(int32x2_t a, __constrange(0,32) int b); // VSHLL.S32 q0,d0,#0 -_NEON2SSE_INLINE int64x2_t vshll_n_s32(int32x2_t a, __constrange(0,32) int b) // VSHLL.S32 q0,d0,#0 -{ - __m128i r; - r = _MM_CVTEPI32_EPI64(_pM128i(a)); //SSE4.1, - return _mm_slli_epi64 (r, b); -} - -uint16x8_t vshll_n_u8(uint8x8_t a, __constrange(0,8) int b); // VSHLL.U8 q0,d0,#0 -_NEON2SSE_INLINE uint16x8_t vshll_n_u8(uint8x8_t a, __constrange(0,8) int b) // VSHLL.U8 q0,d0,#0 -{ - //no uint8 to uint16 conversion available, manual conversion used - __m128i zero, r; - zero = _mm_setzero_si128 (); - r = _mm_unpacklo_epi8(_pM128i(a), zero); - return _mm_slli_epi16 (r, b); -} - -uint32x4_t vshll_n_u16(uint16x4_t a, __constrange(0,16) int b); // VSHLL.s16 q0,d0,#0 -_NEON2SSE_INLINE uint32x4_t vshll_n_u16(uint16x4_t a, __constrange(0,16) int b) // VSHLL.s16 q0,d0,#0 -{ - //no uint16 to uint32 conversion available, manual conversion used - __m128i zero, r; - zero = _mm_setzero_si128 (); - r = _mm_unpacklo_epi16(_pM128i(a), zero); - return _mm_slli_epi32 (r, b); -} - -uint64x2_t vshll_n_u32(uint32x2_t a, __constrange(0,32) int b); // VSHLL.U32 q0,d0,#0 -_NEON2SSE_INLINE uint64x2_t vshll_n_u32(uint32x2_t a, __constrange(0,32) int b) // VSHLL.U32 q0,d0,#0 -{ - //no uint32 to uint64 conversion available, manual conversion used - __m128i zero, r; - zero = 
_mm_setzero_si128 (); - r = _mm_unpacklo_epi32(_pM128i(a), zero); - return _mm_slli_epi64 (r, b); -} - -//************************************************************************************ -//**************************** Shifts with insert ************************************ -//************************************************************************************ -//Takes each element in a vector, shifts it by an immediate value, -//and inserts the result in the destination vector. Bits shifted out of each element are lost. - -//**************** Vector shift right and insert ************************************ -//Actually the "c" left bits from "a" are the only bits that remain from "a" after the shift. -//All other bits are taken from b shifted. -int8x8_t vsri_n_s8(int8x8_t a, int8x8_t b, __constrange(1,8) int c); // VSRI.8 d0,d0,#8 -_NEON2SSE_INLINE int8x8_t vsri_n_s8(int8x8_t a, int8x8_t b, __constrange(1,8) int c) -{ - int8x8_t res64; - return64(vsriq_n_s8(_pM128i(a),_pM128i(b), c)); -} - - -int16x4_t vsri_n_s16(int16x4_t a, int16x4_t b, __constrange(1,16) int c); // VSRI.16 d0,d0,#16 -_NEON2SSE_INLINE int16x4_t vsri_n_s16(int16x4_t a, int16x4_t b, __constrange(1,16) int c) -{ - int16x4_t res64; - return64(vsriq_n_s16(_pM128i(a),_pM128i(b), c)); -} - - -int32x2_t vsri_n_s32(int32x2_t a, int32x2_t b, __constrange(1,32) int c); // VSRI.32 d0,d0,#32 -_NEON2SSE_INLINE int32x2_t vsri_n_s32(int32x2_t a, int32x2_t b, __constrange(1,32) int c) -{ - int32x2_t res64; - return64(vsriq_n_s32(_pM128i(a),_pM128i(b), c)); -} - - -int64x1_t vsri_n_s64(int64x1_t a, int64x1_t b, __constrange(1,64) int c); // VSRI.64 d0,d0,#64 -_NEON2SSE_INLINE int64x1_t vsri_n_s64(int64x1_t a, int64x1_t b, __constrange(1,64) int c) -{ - int64x1_t res; - if (c ==64) - res = a; - else{ - res.m64_i64[0] = (b.m64_u64[0] >> c) | ((a.m64_i64[0] >> (64 - c)) << (64 - c)); //treat b as unsigned for shift to get leading zeros - } - return res; -} - -uint8x8_t vsri_n_u8(uint8x8_t a, uint8x8_t b, 
__constrange(1,8) int c); // VSRI.8 d0,d0,#8 -#define vsri_n_u8 vsri_n_s8 - -uint16x4_t vsri_n_u16(uint16x4_t a, uint16x4_t b, __constrange(1,16) int c); // VSRI.16 d0,d0,#16 -#define vsri_n_u16 vsri_n_s16 - -uint32x2_t vsri_n_u32(uint32x2_t a, uint32x2_t b, __constrange(1,32) int c); // VSRI.32 d0,d0,#32 -#define vsri_n_u32 vsri_n_s32 - - -uint64x1_t vsri_n_u64(uint64x1_t a, uint64x1_t b, __constrange(1,64) int c); // VSRI.64 d0,d0,#64 -#define vsri_n_u64 vsri_n_s64 - -poly8x8_t vsri_n_p8(poly8x8_t a, poly8x8_t b, __constrange(1,8) int c); // VSRI.8 d0,d0,#8 -#define vsri_n_p8 vsri_n_u8 - -poly16x4_t vsri_n_p16(poly16x4_t a, poly16x4_t b, __constrange(1,16) int c); // VSRI.16 d0,d0,#16 -#define vsri_n_p16 vsri_n_u16 - -int8x16_t vsriq_n_s8(int8x16_t a, int8x16_t b, __constrange(1,8) int c); // VSRI.8 q0,q0,#8 -_NEON2SSE_INLINE int8x16_t vsriq_n_s8(int8x16_t a, int8x16_t b, __constrange(1,8) int c) // VSRI.8 q0,q0,#8 -{ - __m128i maskA, a_masked; - uint8x16_t b_shift; - _NEON2SSE_ALIGN_16 uint8_t maskLeft[9] = {0x0, 0x80, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc, 0xfe, 0xff}; //"a" bits mask, 0 bit not used - maskA = _mm_set1_epi8(maskLeft[c]); // c ones and (8-c)zeros - a_masked = _mm_and_si128 (a, maskA); - b_shift = vshrq_n_u8( b, c); // c zeros on the left in b due to logical shift - return _mm_or_si128 (a_masked, b_shift); //combine (insert b into a) -} - -int16x8_t vsriq_n_s16(int16x8_t a, int16x8_t b, __constrange(1,16) int c); // VSRI.16 q0,q0,#16 -_NEON2SSE_INLINE int16x8_t vsriq_n_s16(int16x8_t a, int16x8_t b, __constrange(1,16) int c) // VSRI.16 q0,q0,#16 -{ - //to cut "c" left bits from a we do shift right and then shift back left providing c right zeros in a - uint16x8_t b_shift; - uint16x8_t a_c; - b_shift = vshrq_n_u16( b, c); // c zeros on the left in b due to logical shift - a_c = vshrq_n_u16( a, (16 - c)); - a_c = _mm_slli_epi16(a_c, (16 - c)); //logical shift provides right "c" bits zeros in a - return _mm_or_si128 (a_c, b_shift); //combine (insert b into a) 
-} - -int32x4_t vsriq_n_s32(int32x4_t a, int32x4_t b, __constrange(1,32) int c); // VSRI.32 q0,q0,#32 -_NEON2SSE_INLINE int32x4_t vsriq_n_s32(int32x4_t a, int32x4_t b, __constrange(1,32) int c) // VSRI.32 q0,q0,#32 -{ - //to cut "c" left bits from a we do shift right and then shift back left providing c right zeros in a - uint32x4_t b_shift; - uint32x4_t a_c; - b_shift = vshrq_n_u32( b, c); // c zeros on the left in b due to logical shift - a_c = vshrq_n_u32( a, (32 - c)); - a_c = _mm_slli_epi32(a_c, (32 - c)); //logical shift provides right "c" bits zeros in a - return _mm_or_si128 (a_c, b_shift); //combine (insert b into a) -} - -int64x2_t vsriq_n_s64(int64x2_t a, int64x2_t b, __constrange(1,64) int c); // VSRI.64 q0,q0,#64 -_NEON2SSE_INLINE int64x2_t vsriq_n_s64(int64x2_t a, int64x2_t b, __constrange(1,64) int c) -{ - //serial solution may be faster - uint64x2_t b_shift; - uint64x2_t a_c; - b_shift = _mm_srli_epi64(b, c); // c zeros on the left in b due to logical shift - a_c = _mm_srli_epi64(a, (64 - c)); - a_c = _mm_slli_epi64(a_c, (64 - c)); //logical shift provides right "c" bits zeros in a - return _mm_or_si128 (a_c, b_shift); //combine (insert b into a) -} - -uint8x16_t vsriq_n_u8(uint8x16_t a, uint8x16_t b, __constrange(1,8) int c); // VSRI.8 q0,q0,#8 -#define vsriq_n_u8 vsriq_n_s8 - -uint16x8_t vsriq_n_u16(uint16x8_t a, uint16x8_t b, __constrange(1,16) int c); // VSRI.16 q0,q0,#16 -#define vsriq_n_u16 vsriq_n_s16 - -uint32x4_t vsriq_n_u32(uint32x4_t a, uint32x4_t b, __constrange(1,32) int c); // VSRI.32 q0,q0,#32 -#define vsriq_n_u32 vsriq_n_s32 - -uint64x2_t vsriq_n_u64(uint64x2_t a, uint64x2_t b, __constrange(1,64) int c); // VSRI.64 q0,q0,#64 -#define vsriq_n_u64 vsriq_n_s64 - -poly8x16_t vsriq_n_p8(poly8x16_t a, poly8x16_t b, __constrange(1,8) int c); // VSRI.8 q0,q0,#8 -#define vsriq_n_p8 vsriq_n_u8 - -poly16x8_t vsriq_n_p16(poly16x8_t a, poly16x8_t b, __constrange(1,16) int c); // VSRI.16 q0,q0,#16 -#define vsriq_n_p16 vsriq_n_u16 - -//***** Vector 
shift left and insert ********************************************* -//********************************************************************************* -//Actually the "c" right bits from "a" are the only bits that remain from "a" after the shift. -//All other bits are taken from b shifted. Trailing zeros are inserted in b in the shift process. We need to combine "a" and "b shifted". -int8x8_t vsli_n_s8(int8x8_t a, int8x8_t b, __constrange(0,7) int c); // VSLI.8 d0,d0,#0 -_NEON2SSE_INLINE int8x8_t vsli_n_s8(int8x8_t a, int8x8_t b, __constrange(0,7) int c) -{ - int8x8_t res64; - return64(vsliq_n_s8(_pM128i(a),_pM128i(b), c)); -} - - -int16x4_t vsli_n_s16(int16x4_t a, int16x4_t b, __constrange(0,15) int c); // VSLI.16 d0,d0,#0 -_NEON2SSE_INLINE int16x4_t vsli_n_s16(int16x4_t a, int16x4_t b, __constrange(0,15) int c) -{ - int16x4_t res64; - return64(vsliq_n_s16(_pM128i(a),_pM128i(b), c)); -} - - -int32x2_t vsli_n_s32(int32x2_t a, int32x2_t b, __constrange(0,31) int c); // VSLI.32 d0,d0,#0 -_NEON2SSE_INLINE int32x2_t vsli_n_s32(int32x2_t a, int32x2_t b, __constrange(0,31) int c) -{ - int32x2_t res64; - return64(vsliq_n_s32(_pM128i(a),_pM128i(b), c)); -} - -int64x1_t vsli_n_s64(int64x1_t a, int64x1_t b, __constrange(0,63) int c); // VSLI.64 d0,d0,#0 -_NEON2SSE_INLINE int64x1_t vsli_n_s64(int64x1_t a, int64x1_t b, __constrange(0,63) int c) -{ - int64x1_t res; - if (c == 0) - res.m64_i64[0] = b.m64_i64[0]; //all bits come from b; also avoids the undefined shift by 64 below - else - res.m64_i64[0] = (b.m64_i64[0] << c) | ((a.m64_u64[0] << (64 - c)) >> (64 - c)); //need to treat a as unsigned to get leading zeros - return res; -} - - -uint8x8_t vsli_n_u8(uint8x8_t a, uint8x8_t b, __constrange(0,7) int c); // VSLI.8 d0,d0,#0 -#define vsli_n_u8 vsli_n_s8 - -uint16x4_t vsli_n_u16(uint16x4_t a, uint16x4_t b, __constrange(0,15) int c); // VSLI.16 d0,d0,#0 -#define vsli_n_u16 vsli_n_s16 - -uint32x2_t vsli_n_u32(uint32x2_t a, uint32x2_t b, __constrange(0,31) int c); // VSLI.32 d0,d0,#0 -#define vsli_n_u32 vsli_n_s32 - -uint64x1_t vsli_n_u64(uint64x1_t a, uint64x1_t b, __constrange(0,63) int c); // VSLI.64 
d0,d0,#0 -#define vsli_n_u64 vsli_n_s64 - -poly8x8_t vsli_n_p8(poly8x8_t a, poly8x8_t b, __constrange(0,7) int c); // VSLI.8 d0,d0,#0 -#define vsli_n_p8 vsli_n_u8 - -poly16x4_t vsli_n_p16(poly16x4_t a, poly16x4_t b, __constrange(0,15) int c); // VSLI.16 d0,d0,#0 -#define vsli_n_p16 vsli_n_u16 - -int8x16_t vsliq_n_s8(int8x16_t a, int8x16_t b, __constrange(0,7) int c); // VSLI.8 q0,q0,#0 -_NEON2SSE_INLINE int8x16_t vsliq_n_s8(int8x16_t a, int8x16_t b, __constrange(0,7) int c) // VSLI.8 q0,q0,#0 -{ - __m128i maskA, a_masked; - int8x16_t b_shift; - _NEON2SSE_ALIGN_16 uint8_t maskRight[8] = {0x0, 0x1, 0x3, 0x7, 0x0f, 0x1f, 0x3f, 0x7f}; //"a" bits mask - maskA = _mm_set1_epi8(maskRight[c]); // (8-c)zeros and c ones - b_shift = vshlq_n_s8( b, c); - a_masked = _mm_and_si128 (a, maskA); - return _mm_or_si128 (b_shift, a_masked); //combine (insert b into a) -} - -int16x8_t vsliq_n_s16(int16x8_t a, int16x8_t b, __constrange(0,15) int c); // VSLI.16 q0,q0,#0 -_NEON2SSE_INLINE int16x8_t vsliq_n_s16(int16x8_t a, int16x8_t b, __constrange(0,15) int c) // VSLI.16 q0,q0,#0 -{ - //to cut "c" right bits from a we do shift left and then logical shift back right providing (16-c)zeros in a - int16x8_t b_shift; - int16x8_t a_c; - b_shift = vshlq_n_s16( b, c); - a_c = vshlq_n_s16( a, (16 - c)); - a_c = _mm_srli_epi16(a_c, (16 - c)); - return _mm_or_si128 (b_shift, a_c); //combine (insert b into a) -} - -int32x4_t vsliq_n_s32(int32x4_t a, int32x4_t b, __constrange(0,31) int c); // VSLI.32 q0,q0,#0 -_NEON2SSE_INLINE int32x4_t vsliq_n_s32(int32x4_t a, int32x4_t b, __constrange(0,31) int c) // VSLI.32 q0,q0,#0 -{ - //solution may be not optimal compared with the serial one - //to cut "c" right bits from a we do shift left and then logical shift back right providing (32-c)zeros in a - int32x4_t b_shift; - int32x4_t a_c; - b_shift = vshlq_n_s32( b, c); - a_c = vshlq_n_s32( a, (32 - c)); - a_c = _mm_srli_epi32(a_c, (32 - c)); - return _mm_or_si128 (b_shift, a_c); //combine (insert b into a) -} - 
-int64x2_t vsliq_n_s64(int64x2_t a, int64x2_t b, __constrange(0,63) int c); // VSLI.64 q0,q0,#0 -_NEON2SSE_INLINE int64x2_t vsliq_n_s64(int64x2_t a, int64x2_t b, __constrange(0,63) int c) // VSLI.64 q0,q0,#0 -{ - //solution may be not optimal compared with the serial one - //to cut "c" right bits from a we do shift left and then logical shift back right providing (64-c)zeros in a - int64x2_t b_shift; - int64x2_t a_c; - b_shift = vshlq_n_s64( b, c); - a_c = vshlq_n_s64( a, (64 - c)); - a_c = _mm_srli_epi64(a_c, (64 - c)); - return _mm_or_si128 (b_shift, a_c); //combine (insert b into a) -} - -uint8x16_t vsliq_n_u8(uint8x16_t a, uint8x16_t b, __constrange(0,7) int c); // VSLI.8 q0,q0,#0 -#define vsliq_n_u8 vsliq_n_s8 - -uint16x8_t vsliq_n_u16(uint16x8_t a, uint16x8_t b, __constrange(0,15) int c); // VSLI.16 q0,q0,#0 -#define vsliq_n_u16 vsliq_n_s16 - -uint32x4_t vsliq_n_u32(uint32x4_t a, uint32x4_t b, __constrange(0,31) int c); // VSLI.32 q0,q0,#0 -#define vsliq_n_u32 vsliq_n_s32 - -uint64x2_t vsliq_n_u64(uint64x2_t a, uint64x2_t b, __constrange(0,63) int c); // VSLI.64 q0,q0,#0 -#define vsliq_n_u64 vsliq_n_s64 - -poly8x16_t vsliq_n_p8(poly8x16_t a, poly8x16_t b, __constrange(0,7) int c); // VSLI.8 q0,q0,#0 -#define vsliq_n_p8 vsliq_n_u8 - -poly16x8_t vsliq_n_p16(poly16x8_t a, poly16x8_t b, __constrange(0,15) int c); // VSLI.16 q0,q0,#0 -#define vsliq_n_p16 vsliq_n_u16 - -// *********************************************************************************************** -// ****************** Loads and stores of a single vector *************************************** -// *********************************************************************************************** -//Performs loads and stores of a single vector of some type. 
-//******************************* Loads ******************************************************** -// *********************************************************************************************** -//We assume ptr is NOT aligned in the general case and use __m128i _mm_loadu_si128 ((__m128i*) ptr). -//Also, for SSE3-supporting systems the __m128i _mm_lddqu_si128 (__m128i const* p) usage for unaligned access may be advantageous: -//it loads a 32-byte block aligned on a 16-byte boundary and extracts the 16 bytes corresponding to the unaligned access. -//If the ptr is aligned then __m128i _mm_load_si128 ((__m128i*) ptr) could be used instead; -#define LOAD_SI128(ptr) \ - ( ((unsigned long)(ptr) & 15) == 0 ) ? _mm_load_si128((__m128i*)(ptr)) : _mm_loadu_si128((__m128i*)(ptr)) - -uint8x16_t vld1q_u8(__transfersize(16) uint8_t const * ptr); // VLD1.8 {d0, d1}, [r0] -#define vld1q_u8 LOAD_SI128 - -uint16x8_t vld1q_u16(__transfersize(8) uint16_t const * ptr); // VLD1.16 {d0, d1}, [r0] -#define vld1q_u16 LOAD_SI128 - -uint32x4_t vld1q_u32(__transfersize(4) uint32_t const * ptr); // VLD1.32 {d0, d1}, [r0] -#define vld1q_u32 LOAD_SI128 - -uint64x2_t vld1q_u64(__transfersize(2) uint64_t const * ptr); // VLD1.64 {d0, d1}, [r0] -#define vld1q_u64 LOAD_SI128 - -int8x16_t vld1q_s8(__transfersize(16) int8_t const * ptr); // VLD1.8 {d0, d1}, [r0] -#define vld1q_s8 LOAD_SI128 - -int16x8_t vld1q_s16(__transfersize(8) int16_t const * ptr); // VLD1.16 {d0, d1}, [r0] -#define vld1q_s16 LOAD_SI128 - -int32x4_t vld1q_s32(__transfersize(4) int32_t const * ptr); // VLD1.32 {d0, d1}, [r0] -#define vld1q_s32 LOAD_SI128 - -int64x2_t vld1q_s64(__transfersize(2) int64_t const * ptr); // VLD1.64 {d0, d1}, [r0] -#define vld1q_s64 LOAD_SI128 - -float16x8_t vld1q_f16(__transfersize(8) __fp16 const * ptr); // VLD1.16 {d0, d1}, [r0] -// IA32 SIMD doesn't work with 16bit floats currently, so need to go to 32 bit and then work with two 128bit registers -/* _NEON2SSE_INLINE float16x8_t 
vld1q_f16(__transfersize(8) __fp16 const * ptr)// VLD1.16 {d0, d1}, [r0] -{__m128 f1 = _mm_set_ps (ptr[3], ptr[2], ptr[1], ptr[0]); -__m128 f2; -f2 = _mm_set_ps (ptr[7], ptr[6], ptr[5], ptr[4]); -}*/ - -float32x4_t vld1q_f32(__transfersize(4) float32_t const * ptr); // VLD1.32 {d0, d1}, [r0] -_NEON2SSE_INLINE float32x4_t vld1q_f32(__transfersize(4) float32_t const * ptr) -{ - if( (((unsigned long)(ptr)) & 15 ) == 0 ) //16-byte aligned - return _mm_load_ps(ptr); - else - return _mm_loadu_ps(ptr); -} - -poly8x16_t vld1q_p8(__transfersize(16) poly8_t const * ptr); // VLD1.8 {d0, d1}, [r0] -#define vld1q_p8 LOAD_SI128 - -poly16x8_t vld1q_p16(__transfersize(8) poly16_t const * ptr); // VLD1.16 {d0, d1}, [r0] -#define vld1q_p16 LOAD_SI128 - -uint8x8_t vld1_u8(__transfersize(8) uint8_t const * ptr); // VLD1.8 {d0}, [r0] -#define vld1_u8(ptr) *((__m64_128*)(ptr)) //was _mm_loadl_epi64((__m128i*)(ptr)) - -uint16x4_t vld1_u16(__transfersize(4) uint16_t const * ptr); // VLD1.16 {d0}, [r0] -#define vld1_u16 vld1_u8 - -uint32x2_t vld1_u32(__transfersize(2) uint32_t const * ptr); // VLD1.32 {d0}, [r0] -#define vld1_u32 vld1_u8 - - -uint64x1_t vld1_u64(__transfersize(1) uint64_t const * ptr); // VLD1.64 {d0}, [r0] -#define vld1_u64 vld1_u8 - -int8x8_t vld1_s8(__transfersize(8) int8_t const * ptr); // VLD1.8 {d0}, [r0] -#define vld1_s8 vld1_u8 - -int16x4_t vld1_s16(__transfersize(4) int16_t const * ptr); // VLD1.16 {d0}, [r0] -#define vld1_s16 vld1_u16 - -int32x2_t vld1_s32(__transfersize(2) int32_t const * ptr); // VLD1.32 {d0}, [r0] -#define vld1_s32 vld1_u32 - -int64x1_t vld1_s64(__transfersize(1) int64_t const * ptr); // VLD1.64 {d0}, [r0] -#define vld1_s64 vld1_u64 - -float16x4_t vld1_f16(__transfersize(4) __fp16 const * ptr); // VLD1.16 {d0}, [r0] -// IA32 SIMD doesn't work with 16bit floats currently, so need to go to 32 bit like _mm_set_ps (ptr[3], ptr[2], ptr[1], ptr[0]); - -float32x2_t vld1_f32(__transfersize(2) float32_t const * ptr); // VLD1.32 {d0}, [r0] 
-_NEON2SSE_INLINE float32x2_t vld1_f32(__transfersize(2) float32_t const * ptr) -{ - float32x2_t res; - res.m64_f32[0] = *(ptr); - res.m64_f32[1] = *(ptr + 1); - return res; -} - -poly8x8_t vld1_p8(__transfersize(8) poly8_t const * ptr); // VLD1.8 {d0}, [r0] -#define vld1_p8 vld1_u8 - -poly16x4_t vld1_p16(__transfersize(4) poly16_t const * ptr); // VLD1.16 {d0}, [r0] -#define vld1_p16 vld1_u16 - -//*********************************************************************************************************** -//******* Lane load functions - insert the data at vector's given position (lane) ************************* -//*********************************************************************************************************** -uint8x16_t vld1q_lane_u8(__transfersize(1) uint8_t const * ptr, uint8x16_t vec, __constrange(0,15) int lane); // VLD1.8 {d0[0]}, [r0] -#define vld1q_lane_u8(ptr, vec, lane) _MM_INSERT_EPI8(vec, *(ptr), lane) - -uint16x8_t vld1q_lane_u16(__transfersize(1) uint16_t const * ptr, uint16x8_t vec, __constrange(0,7) int lane); // VLD1.16 {d0[0]}, [r0] -#define vld1q_lane_u16(ptr, vec, lane) _MM_INSERT_EPI16(vec, *(ptr), lane) - -uint32x4_t vld1q_lane_u32(__transfersize(1) uint32_t const * ptr, uint32x4_t vec, __constrange(0,3) int lane); // VLD1.32 {d0[0]}, [r0] -#define vld1q_lane_u32(ptr, vec, lane) _MM_INSERT_EPI32(vec, *(ptr), lane) - -uint64x2_t vld1q_lane_u64(__transfersize(1) uint64_t const * ptr, uint64x2_t vec, __constrange(0,1) int lane); // VLD1.64 {d0}, [r0] -#define vld1q_lane_u64(ptr, vec, lane) _MM_INSERT_EPI64(vec, *(ptr), lane); // _p; - - -int8x16_t vld1q_lane_s8(__transfersize(1) int8_t const * ptr, int8x16_t vec, __constrange(0,15) int lane); // VLD1.8 {d0[0]}, [r0] -#define vld1q_lane_s8(ptr, vec, lane) _MM_INSERT_EPI8(vec, *(ptr), lane) - -int16x8_t vld1q_lane_s16(__transfersize(1) int16_t const * ptr, int16x8_t vec, __constrange(0,7) int lane); // VLD1.16 {d0[0]}, [r0] -#define vld1q_lane_s16(ptr, vec, lane) _MM_INSERT_EPI16(vec, 
*(ptr), lane) - -int32x4_t vld1q_lane_s32(__transfersize(1) int32_t const * ptr, int32x4_t vec, __constrange(0,3) int lane); // VLD1.32 {d0[0]}, [r0] -#define vld1q_lane_s32(ptr, vec, lane) _MM_INSERT_EPI32(vec, *(ptr), lane) - -float16x8_t vld1q_lane_f16(__transfersize(1) __fp16 const * ptr, float16x8_t vec, __constrange(0,7) int lane); // VLD1.16 {d0[0]}, [r0] -//current IA SIMD doesn't support float16 - -float32x4_t vld1q_lane_f32(__transfersize(1) float32_t const * ptr, float32x4_t vec, __constrange(0,3) int lane); // VLD1.32 {d0[0]}, [r0] -_NEON2SSE_INLINE float32x4_t vld1q_lane_f32(__transfersize(1) float32_t const * ptr, float32x4_t vec, __constrange(0,3) int lane) -{ - //we need to deal with ptr 16bit NOT aligned case - __m128 p; - p = _mm_set1_ps(*(ptr)); - return _MM_INSERT_PS(vec, p, _INSERTPS_NDX(0, lane)); -} - -int64x2_t vld1q_lane_s64(__transfersize(1) int64_t const * ptr, int64x2_t vec, __constrange(0,1) int lane); // VLD1.64 {d0}, [r0] -#define vld1q_lane_s64(ptr, vec, lane) _MM_INSERT_EPI64(vec, *(ptr), lane) - -poly8x16_t vld1q_lane_p8(__transfersize(1) poly8_t const * ptr, poly8x16_t vec, __constrange(0,15) int lane); // VLD1.8 {d0[0]}, [r0] -#define vld1q_lane_p8(ptr, vec, lane) _MM_INSERT_EPI8(vec, *(ptr), lane) - -poly16x8_t vld1q_lane_p16(__transfersize(1) poly16_t const * ptr, poly16x8_t vec, __constrange(0,7) int lane); // VLD1.16 {d0[0]}, [r0] -#define vld1q_lane_p16(ptr, vec, lane) _MM_INSERT_EPI16(vec, *(ptr), lane) - -uint8x8_t vld1_lane_u8(__transfersize(1) uint8_t const * ptr, uint8x8_t vec, __constrange(0,7) int lane); // VLD1.8 {d0[0]}, [r0] -_NEON2SSE_INLINE uint8x8_t vld1_lane_u8(__transfersize(1) uint8_t const * ptr, uint8x8_t vec, __constrange(0,7) int lane) -{ - uint8x8_t res; - res = vec; - res.m64_u8[lane] = *(ptr); - return res; -} - -uint16x4_t vld1_lane_u16(__transfersize(1) uint16_t const * ptr, uint16x4_t vec, __constrange(0,3) int lane); // VLD1.16 {d0[0]}, [r0] -_NEON2SSE_INLINE uint16x4_t 
vld1_lane_u16(__transfersize(1) uint16_t const * ptr, uint16x4_t vec, __constrange(0,3) int lane) -{ - uint16x4_t res; - res = vec; - res.m64_u16[lane] = *(ptr); - return res; -} - -uint32x2_t vld1_lane_u32(__transfersize(1) uint32_t const * ptr, uint32x2_t vec, __constrange(0,1) int lane); // VLD1.32 {d0[0]}, [r0] -_NEON2SSE_INLINE uint32x2_t vld1_lane_u32(__transfersize(1) uint32_t const * ptr, uint32x2_t vec, __constrange(0,1) int lane) -{ - uint32x2_t res; - res = vec; - res.m64_u32[lane] = *(ptr); - return res; -} - -uint64x1_t vld1_lane_u64(__transfersize(1) uint64_t const * ptr, uint64x1_t vec, __constrange(0,0) int lane); // VLD1.64 {d0}, [r0] -_NEON2SSE_INLINE uint64x1_t vld1_lane_u64(__transfersize(1) uint64_t const * ptr, uint64x1_t vec, __constrange(0,0) int lane) -{ - uint64x1_t res; - res.m64_u64[0] = *(ptr); - return res; -} - - -int8x8_t vld1_lane_s8(__transfersize(1) int8_t const * ptr, int8x8_t vec, __constrange(0,7) int lane); // VLD1.8 {d0[0]}, [r0] -#define vld1_lane_s8(ptr, vec, lane) vld1_lane_u8((uint8_t*)ptr, vec, lane) - -int16x4_t vld1_lane_s16(__transfersize(1) int16_t const * ptr, int16x4_t vec, __constrange(0,3) int lane); // VLD1.16 {d0[0]}, [r0] -#define vld1_lane_s16(ptr, vec, lane) vld1_lane_u16((uint16_t*)ptr, vec, lane) - -int32x2_t vld1_lane_s32(__transfersize(1) int32_t const * ptr, int32x2_t vec, __constrange(0,1) int lane); // VLD1.32 {d0[0]}, [r0] -#define vld1_lane_s32(ptr, vec, lane) vld1_lane_u32((uint32_t*)ptr, vec, lane) - -float16x4_t vld1_lane_f16(__transfersize(1) __fp16 const * ptr, float16x4_t vec, __constrange(0,3) int lane); // VLD1.16 {d0[0]}, [r0] -//current IA SIMD doesn't support float16 - -float32x2_t vld1_lane_f32(__transfersize(1) float32_t const * ptr, float32x2_t vec, __constrange(0,1) int lane); // VLD1.32 {d0[0]}, [r0] -_NEON2SSE_INLINE float32x2_t vld1_lane_f32(__transfersize(1) float32_t const * ptr, float32x2_t vec, __constrange(0,1) int lane) -{ - float32x2_t res; - res = vec; - res.m64_f32[lane] = 
*(ptr); - return res; -} - -int64x1_t vld1_lane_s64(__transfersize(1) int64_t const * ptr, int64x1_t vec, __constrange(0,0) int lane); // VLD1.64 {d0}, [r0] -#define vld1_lane_s64(ptr, vec, lane) vld1_lane_u64((uint64_t*)ptr, vec, lane) - -poly8x8_t vld1_lane_p8(__transfersize(1) poly8_t const * ptr, poly8x8_t vec, __constrange(0,7) int lane); // VLD1.8 {d0[0]}, [r0] -#define vld1_lane_p8 vld1_lane_u8 - -poly16x4_t vld1_lane_p16(__transfersize(1) poly16_t const * ptr, poly16x4_t vec, __constrange(0,3) int lane); // VLD1.16 {d0[0]}, [r0] -#define vld1_lane_p16 vld1_lane_s16 - -// ****************** Load single value ( set all lanes of vector with same value from memory)********************** -// ****************************************************************************************************************** -uint8x16_t vld1q_dup_u8(__transfersize(1) uint8_t const * ptr); // VLD1.8 {d0[]}, [r0] -#define vld1q_dup_u8(ptr) _mm_set1_epi8(*(ptr)) - -uint16x8_t vld1q_dup_u16(__transfersize(1) uint16_t const * ptr); // VLD1.16 {d0[]}, [r0] -#define vld1q_dup_u16(ptr) _mm_set1_epi16(*(ptr)) - -uint32x4_t vld1q_dup_u32(__transfersize(1) uint32_t const * ptr); // VLD1.32 {d0[]}, [r0] -#define vld1q_dup_u32(ptr) _mm_set1_epi32(*(ptr)) - -uint64x2_t vld1q_dup_u64(__transfersize(1) uint64_t const * ptr); // VLD1.64 {d0}, [r0] -_NEON2SSE_INLINE uint64x2_t vld1q_dup_u64(__transfersize(1) uint64_t const * ptr) -{ - _NEON2SSE_ALIGN_16 uint64_t val[2] = {*(ptr), *(ptr)}; - return LOAD_SI128(val); -} - -int8x16_t vld1q_dup_s8(__transfersize(1) int8_t const * ptr); // VLD1.8 {d0[]}, [r0] -#define vld1q_dup_s8(ptr) _mm_set1_epi8(*(ptr)) - -int16x8_t vld1q_dup_s16(__transfersize(1) int16_t const * ptr); // VLD1.16 {d0[]}, [r0] -#define vld1q_dup_s16(ptr) _mm_set1_epi16 (*(ptr)) - -int32x4_t vld1q_dup_s32(__transfersize(1) int32_t const * ptr); // VLD1.32 {d0[]}, [r0] -#define vld1q_dup_s32(ptr) _mm_set1_epi32 (*(ptr)) - -int64x2_t vld1q_dup_s64(__transfersize(1) int64_t const * ptr); // 
VLD1.64 {d0}, [r0] -#define vld1q_dup_s64(ptr) vld1q_dup_u64((uint64_t*)ptr) - -float16x8_t vld1q_dup_f16(__transfersize(1) __fp16 const * ptr); // VLD1.16 {d0[]}, [r0] -//current IA SIMD doesn't support float16, need to go to 32 bits - -float32x4_t vld1q_dup_f32(__transfersize(1) float32_t const * ptr); // VLD1.32 {d0[]}, [r0] -#define vld1q_dup_f32(ptr) _mm_set1_ps (*(ptr)) - -poly8x16_t vld1q_dup_p8(__transfersize(1) poly8_t const * ptr); // VLD1.8 {d0[]}, [r0] -#define vld1q_dup_p8(ptr) _mm_set1_epi8(*(ptr)) - -poly16x8_t vld1q_dup_p16(__transfersize(1) poly16_t const * ptr); // VLD1.16 {d0[]}, [r0] -#define vld1q_dup_p16(ptr) _mm_set1_epi16 (*(ptr)) - -uint8x8_t vld1_dup_u8(__transfersize(1) uint8_t const * ptr); // VLD1.8 {d0[]}, [r0] -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint8x8_t vld1_dup_u8(__transfersize(1) uint8_t const * ptr), _NEON2SSE_REASON_SLOW_SERIAL) -{ - uint8x8_t res; - int i; - for(i = 0; i<8; i++) { - res.m64_u8[i] = *(ptr); - } - return res; -} - -uint16x4_t vld1_dup_u16(__transfersize(1) uint16_t const * ptr); // VLD1.16 {d0[]}, [r0] -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint16x4_t vld1_dup_u16(__transfersize(1) uint16_t const * ptr), _NEON2SSE_REASON_SLOW_SERIAL) -{ - uint16x4_t res; - int i; - for(i = 0; i<4; i++) { - res.m64_u16[i] = *(ptr); - } - return res; -} - -uint32x2_t vld1_dup_u32(__transfersize(1) uint32_t const * ptr); // VLD1.32 {d0[]}, [r0] -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x2_t vld1_dup_u32(__transfersize(1) uint32_t const * ptr), _NEON2SSE_REASON_SLOW_SERIAL) -{ - uint32x2_t res; - res.m64_u32[0] = *(ptr); - res.m64_u32[1] = *(ptr); - return res; -} - -uint64x1_t vld1_dup_u64(__transfersize(1) uint64_t const * ptr); // VLD1.64 {d0}, [r0] -_NEON2SSE_INLINE uint64x1_t vld1_dup_u64(__transfersize(1) uint64_t const * ptr) -{ - uint64x1_t res; - res.m64_u64[0] = *(ptr); - return res; -} - -int8x8_t vld1_dup_s8(__transfersize(1) int8_t const * ptr); // VLD1.8 {d0[]}, [r0] -#define 
vld1_dup_s8(ptr) vld1_dup_u8((uint8_t*)ptr) - - -int16x4_t vld1_dup_s16(__transfersize(1) int16_t const * ptr); // VLD1.16 {d0[]}, [r0] -#define vld1_dup_s16(ptr) vld1_dup_u16((uint16_t*)ptr) - - -int32x2_t vld1_dup_s32(__transfersize(1) int32_t const * ptr); // VLD1.32 {d0[]}, [r0] -#define vld1_dup_s32(ptr) vld1_dup_u32((uint32_t*)ptr) - - -int64x1_t vld1_dup_s64(__transfersize(1) int64_t const * ptr); // VLD1.64 {d0}, [r0] -#define vld1_dup_s64(ptr) vld1_dup_u64((uint64_t*)ptr) - -float16x4_t vld1_dup_f16(__transfersize(1) __fp16 const * ptr); // VLD1.16 {d0[]}, [r0] -//current IA SIMD doesn't support float16 - -float32x2_t vld1_dup_f32(__transfersize(1) float32_t const * ptr); // VLD1.32 {d0[]}, [r0] -_NEON2SSE_INLINE float32x2_t vld1_dup_f32(__transfersize(1) float32_t const * ptr) -{ - float32x2_t res; - res.m64_f32[0] = *(ptr); - res.m64_f32[1] = res.m64_f32[0]; - return res; // use last 64bits only -} - -poly8x8_t vld1_dup_p8(__transfersize(1) poly8_t const * ptr); // VLD1.8 {d0[]}, [r0] -#define vld1_dup_p8 vld1_dup_u8 - - -poly16x4_t vld1_dup_p16(__transfersize(1) poly16_t const * ptr); // VLD1.16 {d0[]}, [r0] -#define vld1_dup_p16 vld1_dup_u16 - - -//************************************************************************************* -//********************************* Store ********************************************** -//************************************************************************************* -// If ptr is 16-byte aligned and you need to store data without cache pollution then use void _mm_stream_si128 ((__m128i*)ptr, val); -//here we assume that a NOT 16-byte aligned ptr is possible. If it is aligned we could use _mm_store_si128 like shown in the following macro -#define STORE_SI128(ptr, val) \ - (((unsigned long)(ptr) & 15) == 0 ) ? 
_mm_store_si128 ((__m128i*)(ptr), val) : _mm_storeu_si128 ((__m128i*)(ptr), val); - -void vst1q_u8(__transfersize(16) uint8_t * ptr, uint8x16_t val); // VST1.8 {d0, d1}, [r0] -#define vst1q_u8 STORE_SI128 - -void vst1q_u16(__transfersize(8) uint16_t * ptr, uint16x8_t val); // VST1.16 {d0, d1}, [r0] -#define vst1q_u16 STORE_SI128 - -void vst1q_u32(__transfersize(4) uint32_t * ptr, uint32x4_t val); // VST1.32 {d0, d1}, [r0] -#define vst1q_u32 STORE_SI128 - -void vst1q_u64(__transfersize(2) uint64_t * ptr, uint64x2_t val); // VST1.64 {d0, d1}, [r0] -#define vst1q_u64 STORE_SI128 - -void vst1q_s8(__transfersize(16) int8_t * ptr, int8x16_t val); // VST1.8 {d0, d1}, [r0] -#define vst1q_s8 STORE_SI128 - -void vst1q_s16(__transfersize(8) int16_t * ptr, int16x8_t val); // VST1.16 {d0, d1}, [r0] -#define vst1q_s16 STORE_SI128 - -void vst1q_s32(__transfersize(4) int32_t * ptr, int32x4_t val); // VST1.32 {d0, d1}, [r0] -#define vst1q_s32 STORE_SI128 - -void vst1q_s64(__transfersize(2) int64_t * ptr, int64x2_t val); // VST1.64 {d0, d1}, [r0] -#define vst1q_s64 STORE_SI128 - -void vst1q_f16(__transfersize(8) __fp16 * ptr, float16x8_t val); // VST1.16 {d0, d1}, [r0] -// IA32 SIMD doesn't work with 16bit floats currently - -void vst1q_f32(__transfersize(4) float32_t * ptr, float32x4_t val); // VST1.32 {d0, d1}, [r0] -_NEON2SSE_INLINE void vst1q_f32(__transfersize(4) float32_t * ptr, float32x4_t val) -{ - if( ((unsigned long)(ptr) & 15) == 0 ) //16-byte aligned - _mm_store_ps (ptr, val); - else - _mm_storeu_ps (ptr, val); -} - -void vst1q_p8(__transfersize(16) poly8_t * ptr, poly8x16_t val); // VST1.8 {d0, d1}, [r0] -#define vst1q_p8 vst1q_u8 - -void vst1q_p16(__transfersize(8) poly16_t * ptr, poly16x8_t val); // VST1.16 {d0, d1}, [r0] -#define vst1q_p16 vst1q_u16 - -void vst1_u8(__transfersize(8) uint8_t * ptr, uint8x8_t val); // VST1.8 {d0}, [r0] -_NEON2SSE_INLINE void vst1_u8(__transfersize(8) uint8_t * ptr, uint8x8_t val) -{ - int i; - for (i = 0; i<8; i++) { - *(ptr + i) =
((uint8_t*)&val)[i]; - } - //_mm_storel_epi64((__m128i*)ptr, val); - return; -} - -void vst1_u16(__transfersize(4) uint16_t * ptr, uint16x4_t val); // VST1.16 {d0}, [r0] -_NEON2SSE_INLINE void vst1_u16(__transfersize(4) uint16_t * ptr, uint16x4_t val) -{ - int i; - for (i = 0; i<4; i++) { - *(ptr + i) = ((uint16_t*)&val)[i]; - } - //_mm_storel_epi64((__m128i*)ptr, val); - return; -} - -void vst1_u32(__transfersize(2) uint32_t * ptr, uint32x2_t val); // VST1.32 {d0}, [r0] -_NEON2SSE_INLINE void vst1_u32(__transfersize(2) uint32_t * ptr, uint32x2_t val) -{ - int i; - for (i = 0; i<2; i++) { - *(ptr + i) = ((uint32_t*)&val)[i]; - } - //_mm_storel_epi64((__m128i*)ptr, val); - return; -} - -void vst1_u64(__transfersize(1) uint64_t * ptr, uint64x1_t val); // VST1.64 {d0}, [r0] -_NEON2SSE_INLINE void vst1_u64(__transfersize(1) uint64_t * ptr, uint64x1_t val) -{ - *(ptr) = *((uint64_t*)&val); - //_mm_storel_epi64((__m128i*)ptr, val); - return; -} - -void vst1_s8(__transfersize(8) int8_t * ptr, int8x8_t val); // VST1.8 {d0}, [r0] -#define vst1_s8(ptr,val) vst1_u8((uint8_t*)ptr,val) - -void vst1_s16(__transfersize(4) int16_t * ptr, int16x4_t val); // VST1.16 {d0}, [r0] -#define vst1_s16(ptr,val) vst1_u16((uint16_t*)ptr,val) - -void vst1_s32(__transfersize(2) int32_t * ptr, int32x2_t val); // VST1.32 {d0}, [r0] -#define vst1_s32(ptr,val) vst1_u32((uint32_t*)ptr,val) - -void vst1_s64(__transfersize(1) int64_t * ptr, int64x1_t val); // VST1.64 {d0}, [r0] -#define vst1_s64(ptr,val) vst1_u64((uint64_t*)ptr,val) - -void vst1_f16(__transfersize(4) __fp16 * ptr, float16x4_t val); // VST1.16 {d0}, [r0] -//current IA SIMD doesn't support float16 - -void vst1_f32(__transfersize(2) float32_t * ptr, float32x2_t val); // VST1.32 {d0}, [r0] -_NEON2SSE_INLINE void vst1_f32(__transfersize(2) float32_t * ptr, float32x2_t val) -{ - *(ptr) = val.m64_f32[0]; - *(ptr + 1) = val.m64_f32[1]; - return; -} - -void vst1_p8(__transfersize(8) poly8_t * ptr, poly8x8_t val); // VST1.8 {d0}, [r0] -#define 
vst1_p8 vst1_u8 - -void vst1_p16(__transfersize(4) poly16_t * ptr, poly16x4_t val); // VST1.16 {d0}, [r0] -#define vst1_p16 vst1_u16 - -//***********Store a lane of a vector into memory (extract given lane) ********************* -//****************************************************************************************** -void vst1q_lane_u8(__transfersize(1) uint8_t * ptr, uint8x16_t val, __constrange(0,15) int lane); // VST1.8 {d0[0]}, [r0] -#define vst1q_lane_u8(ptr, val, lane) *(ptr) = _MM_EXTRACT_EPI8 (val, lane) - -void vst1q_lane_u16(__transfersize(1) uint16_t * ptr, uint16x8_t val, __constrange(0,7) int lane); // VST1.16 {d0[0]}, [r0] -#define vst1q_lane_u16(ptr, val, lane) *(ptr) = _MM_EXTRACT_EPI16 (val, lane) - -void vst1q_lane_u32(__transfersize(1) uint32_t * ptr, uint32x4_t val, __constrange(0,3) int lane); // VST1.32 {d0[0]}, [r0] -#define vst1q_lane_u32(ptr, val, lane) *(ptr) = _MM_EXTRACT_EPI32 (val, lane) - -void vst1q_lane_u64(__transfersize(1) uint64_t * ptr, uint64x2_t val, __constrange(0,1) int lane); // VST1.64 {d0}, [r0] -#define vst1q_lane_u64(ptr, val, lane) *(ptr) = _MM_EXTRACT_EPI64 (val, lane) - -void vst1q_lane_s8(__transfersize(1) int8_t * ptr, int8x16_t val, __constrange(0,15) int lane); // VST1.8 {d0[0]}, [r0] -#define vst1q_lane_s8(ptr, val, lane) *(ptr) = _MM_EXTRACT_EPI8 (val, lane) - -void vst1q_lane_s16(__transfersize(1) int16_t * ptr, int16x8_t val, __constrange(0,7) int lane); // VST1.16 {d0[0]}, [r0] -#define vst1q_lane_s16(ptr, val, lane) *(ptr) = _MM_EXTRACT_EPI16 (val, lane) - -void vst1q_lane_s32(__transfersize(1) int32_t * ptr, int32x4_t val, __constrange(0,3) int lane); // VST1.32 {d0[0]}, [r0] -#define vst1q_lane_s32(ptr, val, lane) *(ptr) = _MM_EXTRACT_EPI32 (val, lane) - -void vst1q_lane_s64(__transfersize(1) int64_t * ptr, int64x2_t val, __constrange(0,1) int lane); // VST1.64 {d0}, [r0] -#define vst1q_lane_s64(ptr, val, lane) *(ptr) = _MM_EXTRACT_EPI64 (val, lane) - -void vst1q_lane_f16(__transfersize(1) __fp16 * 
ptr, float16x8_t val, __constrange(0,7) int lane); // VST1.16 {d0[0]}, [r0] -//current IA SIMD doesn't support float16 - -void vst1q_lane_f32(__transfersize(1) float32_t * ptr, float32x4_t val, __constrange(0,3) int lane); // VST1.32 {d0[0]}, [r0] -_NEON2SSE_INLINE void vst1q_lane_f32(__transfersize(1) float32_t * ptr, float32x4_t val, __constrange(0,3) int lane) -{ - int32_t ilane; - ilane = _MM_EXTRACT_PS(val,lane); - *(ptr) = *((float*)&ilane); -} - -void vst1q_lane_p8(__transfersize(1) poly8_t * ptr, poly8x16_t val, __constrange(0,15) int lane); // VST1.8 {d0[0]}, [r0] -#define vst1q_lane_p8 vst1q_lane_u8 - -void vst1q_lane_p16(__transfersize(1) poly16_t * ptr, poly16x8_t val, __constrange(0,7) int lane); // VST1.16 {d0[0]}, [r0] -#define vst1q_lane_p16 vst1q_lane_s16 - -void vst1_lane_u8(__transfersize(1) uint8_t * ptr, uint8x8_t val, __constrange(0,7) int lane); // VST1.8 {d0[0]}, [r0] -_NEON2SSE_INLINE void vst1_lane_u8(__transfersize(1) uint8_t * ptr, uint8x8_t val, __constrange(0,7) int lane) -{ - *(ptr) = val.m64_u8[lane]; -} - -void vst1_lane_u16(__transfersize(1) uint16_t * ptr, uint16x4_t val, __constrange(0,3) int lane); // VST1.16 {d0[0]}, [r0] -_NEON2SSE_INLINE void vst1_lane_u16(__transfersize(1) uint16_t * ptr, uint16x4_t val, __constrange(0,3) int lane) -{ - *(ptr) = val.m64_u16[lane]; -} - -void vst1_lane_u32(__transfersize(1) uint32_t * ptr, uint32x2_t val, __constrange(0,1) int lane); // VST1.32 {d0[0]}, [r0] -_NEON2SSE_INLINE void vst1_lane_u32(__transfersize(1) uint32_t * ptr, uint32x2_t val, __constrange(0,1) int lane) -{ - *(ptr) = val.m64_u32[lane]; -} - -void vst1_lane_u64(__transfersize(1) uint64_t * ptr, uint64x1_t val, __constrange(0,0) int lane); // VST1.64 {d0}, [r0] -_NEON2SSE_INLINE void vst1_lane_u64(__transfersize(1) uint64_t * ptr, uint64x1_t val, __constrange(0,0) int lane) -{ - *(ptr) = val.m64_u64[0]; -} - -void vst1_lane_s8(__transfersize(1) int8_t * ptr, int8x8_t val, __constrange(0,7) int lane); // VST1.8 {d0[0]}, [r0] 
-#define vst1_lane_s8(ptr, val, lane) vst1_lane_u8((uint8_t*)ptr, val, lane) - -void vst1_lane_s16(__transfersize(1) int16_t * ptr, int16x4_t val, __constrange(0,3) int lane); // VST1.16 {d0[0]}, [r0] -#define vst1_lane_s16(ptr, val, lane) vst1_lane_u16((uint16_t*)ptr, val, lane) - -void vst1_lane_s32(__transfersize(1) int32_t * ptr, int32x2_t val, __constrange(0,1) int lane); // VST1.32 {d0[0]}, [r0] -#define vst1_lane_s32(ptr, val, lane) vst1_lane_u32((uint32_t*)ptr, val, lane) - - -void vst1_lane_s64(__transfersize(1) int64_t * ptr, int64x1_t val, __constrange(0,0) int lane); // VST1.64 {d0}, [r0] -#define vst1_lane_s64(ptr, val, lane) vst1_lane_u64((uint64_t*)ptr, val, lane) - - -void vst1_lane_f16(__transfersize(1) __fp16 * ptr, float16x4_t val, __constrange(0,3) int lane); // VST1.16 {d0[0]}, [r0] -//current IA SIMD doesn't support float16 - -void vst1_lane_f32(__transfersize(1) float32_t * ptr, float32x2_t val, __constrange(0,1) int lane); // VST1.32 {d0[0]}, [r0] -_NEON2SSE_INLINE void vst1_lane_f32(__transfersize(1) float32_t * ptr, float32x2_t val, __constrange(0,1) int lane) -{ - *(ptr) = val.m64_f32[lane]; -} - -void vst1_lane_p8(__transfersize(1) poly8_t * ptr, poly8x8_t val, __constrange(0,7) int lane); // VST1.8 {d0[0]}, [r0] -#define vst1_lane_p8 vst1_lane_u8 - -void vst1_lane_p16(__transfersize(1) poly16_t * ptr, poly16x4_t val, __constrange(0,3) int lane); // VST1.16 {d0[0]}, [r0] -#define vst1_lane_p16 vst1_lane_s16 - -//*********************************************************************************************** -//**************** Loads and stores of an N-element structure ********************************** -//*********************************************************************************************** -//These intrinsics load or store an n-element structure. 
The array structures are defined in the beginning -//We assume ptr is NOT aligned in general case, for more details see "Loads and stores of a single vector functions" -//****************** 2 elements load ********************************************* -uint8x16x2_t vld2q_u8(__transfersize(32) uint8_t const * ptr); // VLD2.8 {d0, d2}, [r0] -_NEON2SSE_INLINE uint8x16x2_t vld2q_u8(__transfersize(32) uint8_t const * ptr) // VLD2.8 {d0, d2}, [r0] -{ - uint8x16x2_t v; - v.val[0] = vld1q_u8(ptr); - v.val[1] = vld1q_u8((ptr + 16)); - v = vuzpq_s8(v.val[0], v.val[1]); - return v; -} - -uint16x8x2_t vld2q_u16(__transfersize(16) uint16_t const * ptr); // VLD2.16 {d0, d2}, [r0] -_NEON2SSE_INLINE uint16x8x2_t vld2q_u16(__transfersize(16) uint16_t const * ptr) // VLD2.16 {d0, d2}, [r0] -{ - uint16x8x2_t v; - v.val[0] = vld1q_u16( ptr); - v.val[1] = vld1q_u16( (ptr + 8)); - v = vuzpq_s16(v.val[0], v.val[1]); - return v; -} - -uint32x4x2_t vld2q_u32(__transfersize(8) uint32_t const * ptr); // VLD2.32 {d0, d2}, [r0] -_NEON2SSE_INLINE uint32x4x2_t vld2q_u32(__transfersize(8) uint32_t const * ptr) // VLD2.32 {d0, d2}, [r0] -{ - uint32x4x2_t v; - v.val[0] = vld1q_u32 ( ptr); - v.val[1] = vld1q_u32 ( (ptr + 4)); - v = vuzpq_s32(v.val[0], v.val[1]); - return v; -} - -int8x16x2_t vld2q_s8(__transfersize(32) int8_t const * ptr); -#define vld2q_s8(ptr) vld2q_u8((uint8_t*) ptr) - -int16x8x2_t vld2q_s16(__transfersize(16) int16_t const * ptr); // VLD2.16 {d0, d2}, [r0] -#define vld2q_s16(ptr) vld2q_u16((uint16_t*) ptr) - -int32x4x2_t vld2q_s32(__transfersize(8) int32_t const * ptr); // VLD2.32 {d0, d2}, [r0] -#define vld2q_s32(ptr) vld2q_u32((uint32_t*) ptr) - - -float16x8x2_t vld2q_f16(__transfersize(16) __fp16 const * ptr); // VLD2.16 {d0, d2}, [r0] -// IA32 SIMD doesn't work with 16bit floats currently, so need to go to 32 bit and then work with two 128bit registers. 
See vld1q_f16 for example - -float32x4x2_t vld2q_f32(__transfersize(8) float32_t const * ptr); // VLD2.32 {d0, d2}, [r0] -_NEON2SSE_INLINE float32x4x2_t vld2q_f32(__transfersize(8) float32_t const * ptr) // VLD2.32 {d0, d2}, [r0] -{ - float32x4x2_t v; - v.val[0] = vld1q_f32 (ptr); - v.val[1] = vld1q_f32 ((ptr + 4)); - v = vuzpq_f32(v.val[0], v.val[1]); - return v; -} - -poly8x16x2_t vld2q_p8(__transfersize(32) poly8_t const * ptr); // VLD2.8 {d0, d2}, [r0] -#define vld2q_p8 vld2q_u8 - -poly16x8x2_t vld2q_p16(__transfersize(16) poly16_t const * ptr); // VLD2.16 {d0, d2}, [r0] -#define vld2q_p16 vld2q_u16 - -uint8x8x2_t vld2_u8(__transfersize(16) uint8_t const * ptr); // VLD2.8 {d0, d1}, [r0] -_NEON2SSE_INLINE uint8x8x2_t vld2_u8(__transfersize(16) uint8_t const * ptr) -{ - uint8x8x2_t v; - _NEON2SSE_ALIGN_16 int8_t mask8_even_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15}; - __m128i ld128; - ld128 = vld1q_u8(ptr); //merge two 64-bits in 128 bit - ld128 = _mm_shuffle_epi8(ld128, *(__m128i*)mask8_even_odd); - vst1q_u8((v.val), ld128); // v.val[1] = _mm_shuffle_epi32(v.val[0], _SWAP_HI_LOW32); - return v; -} - -uint16x4x2_t vld2_u16(__transfersize(8) uint16_t const * ptr); // VLD2.16 {d0, d1}, [r0] -_NEON2SSE_INLINE uint16x4x2_t vld2_u16(__transfersize(8) uint16_t const * ptr) -{ - _NEON2SSE_ALIGN_16 uint16x4x2_t v; - _NEON2SSE_ALIGN_16 int8_t mask16_even_odd[16] = { 0,1, 4,5, 8,9, 12,13, 2,3, 6,7, 10,11, 14,15}; - __m128i ld128; - ld128 = vld1q_u16(ptr); //merge two 64-bits in 128 bit - ld128 = _mm_shuffle_epi8(ld128, *(__m128i*)mask16_even_odd); - vst1q_u16((v.val), ld128); - return v; -} - -uint32x2x2_t vld2_u32(__transfersize(4) uint32_t const * ptr); // VLD2.32 {d0, d1}, [r0] -_NEON2SSE_INLINE uint32x2x2_t vld2_u32(__transfersize(4) uint32_t const * ptr) -{ - _NEON2SSE_ALIGN_16 uint32x2x2_t v; - __m128i ld128; - ld128 = vld1q_u32(ptr); //merge two 64-bits in 128 bit - ld128 = _mm_shuffle_epi32(ld128, 0 | (2 << 2) | (1 << 4) | (3 << 6)); - 
vst1q_u32((v.val), ld128); - return v; -} - -uint64x1x2_t vld2_u64(__transfersize(2) uint64_t const * ptr); // VLD1.64 {d0, d1}, [r0] -_NEON2SSE_INLINE uint64x1x2_t vld2_u64(__transfersize(2) uint64_t const * ptr) -{ - uint64x1x2_t v; - v.val[0].m64_u64[0] = *(ptr); - v.val[1].m64_u64[0] = *(ptr + 1); - return v; -} - -int8x8x2_t vld2_s8(__transfersize(16) int8_t const * ptr); // VLD2.8 {d0, d1}, [r0] -#define vld2_s8(ptr) vld2_u8((uint8_t*)ptr) - -int16x4x2_t vld2_s16(__transfersize(8) int16_t const * ptr); // VLD2.16 {d0, d1}, [r0] -#define vld2_s16(ptr) vld2_u16((uint16_t*)ptr) - -int32x2x2_t vld2_s32(__transfersize(4) int32_t const * ptr); // VLD2.32 {d0, d1}, [r0] -#define vld2_s32(ptr) vld2_u32((uint32_t*)ptr) - -int64x1x2_t vld2_s64(__transfersize(2) int64_t const * ptr); // VLD1.64 {d0, d1}, [r0] -#define vld2_s64(ptr) vld2_u64((uint64_t*)ptr) - -float16x4x2_t vld2_f16(__transfersize(8) __fp16 const * ptr); // VLD2.16 {d0, d1}, [r0] -// IA32 SIMD doesn't work with 16bit floats currently, so need to go to 32 bit and then work with two 128bit registers. 
See vld1_f16 for example - -float32x2x2_t vld2_f32(__transfersize(4) float32_t const * ptr); // VLD2.32 {d0, d1}, [r0] -_NEON2SSE_INLINE float32x2x2_t vld2_f32(__transfersize(4) float32_t const * ptr) -{ - float32x2x2_t v; - v.val[0].m64_f32[0] = *(ptr); - v.val[0].m64_f32[1] = *(ptr + 2); - v.val[1].m64_f32[0] = *(ptr + 1); - v.val[1].m64_f32[1] = *(ptr + 3); - return v; -} - -poly8x8x2_t vld2_p8(__transfersize(16) poly8_t const * ptr); // VLD2.8 {d0, d1}, [r0] -#define vld2_p8 vld2_u8 - -poly16x4x2_t vld2_p16(__transfersize(8) poly16_t const * ptr); // VLD2.16 {d0, d1}, [r0] -#define vld2_p16 vld2_u16 - -//******************** Triplets *************************************** -//********************************************************************* -uint8x16x3_t vld3q_u8(__transfersize(48) uint8_t const * ptr); // VLD3.8 {d0, d2, d4}, [r0] -_NEON2SSE_INLINE uint8x16x3_t vld3q_u8(__transfersize(48) uint8_t const * ptr) // VLD3.8 {d0, d2, d4}, [r0] -{ - //a0,a1,a2,a3,...a7,a8,...a15, b0,b1,b2,...b7,b8,...b15, c0,c1,c2,...c7,c8,...c15 -> - //a:0,3,6,9,12,15,b:2,5,8,11,14, c:1,4,7,10,13 - //a:1,4,7,10,13, b:0,3,6,9,12,15,c:2,5,8,11,14, - //a:2,5,8,11,14, b:1,4,7,10,13, c:0,3,6,9,12,15 - uint8x16x3_t v; - __m128i tmp0, tmp1,tmp2, tmp3; - _NEON2SSE_ALIGN_16 int8_t mask8_0[16] = {0,3,6,9,12,15,1,4,7,10,13,2,5,8,11,14}; - _NEON2SSE_ALIGN_16 int8_t mask8_1[16] = {2,5,8,11,14,0,3,6,9,12,15,1,4,7,10,13}; - _NEON2SSE_ALIGN_16 int8_t mask8_2[16] = {1,4,7,10,13,2,5,8,11,14,0,3,6,9,12,15}; - - v.val[0] = vld1q_u8 (ptr); //a0,a1,a2,a3,...a7, ...a15 - v.val[1] = vld1q_u8 ((ptr + 16)); //b0,b1,b2,b3...b7, ...b15 - v.val[2] = vld1q_u8 ((ptr + 32)); //c0,c1,c2,c3,...c7,...c15 - - tmp0 = _mm_shuffle_epi8(v.val[0], *(__m128i*)mask8_0); //a:0,3,6,9,12,15,1,4,7,10,13,2,5,8,11 - tmp1 = _mm_shuffle_epi8(v.val[1], *(__m128i*)mask8_1); //b:2,5,8,11,14,0,3,6,9,12,15,1,4,7,10,13 - tmp2 = _mm_shuffle_epi8(v.val[2], *(__m128i*)mask8_2); //c:1,4,7,10,13,2,5,8,11,14,3,6,9,12,15 - - tmp3 = 
_mm_slli_si128(tmp0,10); //0,0,0,0,0,0,0,0,0,0,a0,a3,a6,a9,a12,a15 - tmp3 = _mm_alignr_epi8(tmp1,tmp3, 10); //a:0,3,6,9,12,15,b:2,5,8,11,14,x,x,x,x,x - tmp3 = _mm_slli_si128(tmp3, 5); //0,0,0,0,0,a:0,3,6,9,12,15,b:2,5,8,11,14, - tmp3 = _mm_srli_si128(tmp3, 5); //a:0,3,6,9,12,15,b:2,5,8,11,14,:0,0,0,0,0 - v.val[0] = _mm_slli_si128(tmp2, 11); //0,0,0,0,0,0,0,0,0,0,0,0, 1,4,7,10,13, - v.val[0] = _mm_or_si128(v.val[0],tmp3); //a:0,3,6,9,12,15,b:2,5,8,11,14,c:1,4,7,10,13, - - tmp3 = _mm_slli_si128(tmp0, 5); //0,0,0,0,0,a:0,3,6,9,12,15,1,4,7,10,13, - tmp3 = _mm_srli_si128(tmp3, 11); //a:1,4,7,10,13, 0,0,0,0,0,0,0,0,0,0,0 - v.val[1] = _mm_srli_si128(tmp1,5); //b:0,3,6,9,12,15,C:1,4,7,10,13, 0,0,0,0,0 - v.val[1] = _mm_slli_si128(v.val[1], 5); //0,0,0,0,0,b:0,3,6,9,12,15,C:1,4,7,10,13, - v.val[1] = _mm_or_si128(v.val[1],tmp3); //a:1,4,7,10,13,b:0,3,6,9,12,15,C:1,4,7,10,13, - v.val[1] = _mm_slli_si128(v.val[1],5); //0,0,0,0,0,a:1,4,7,10,13,b:0,3,6,9,12,15, - v.val[1] = _mm_srli_si128(v.val[1], 5); //a:1,4,7,10,13,b:0,3,6,9,12,15,0,0,0,0,0 - tmp3 = _mm_srli_si128(tmp2,5); //c:2,5,8,11,14,0,3,6,9,12,15,0,0,0,0,0 - tmp3 = _mm_slli_si128(tmp3,11); //0,0,0,0,0,0,0,0,0,0,0,c:2,5,8,11,14, - v.val[1] = _mm_or_si128(v.val[1],tmp3); //a:1,4,7,10,13,b:0,3,6,9,12,15,c:2,5,8,11,14, - - tmp3 = _mm_srli_si128(tmp2,10); //c:0,3,6,9,12,15, 0,0,0,0,0,0,0,0,0,0, - tmp3 = _mm_slli_si128(tmp3,10); //0,0,0,0,0,0,0,0,0,0, c:0,3,6,9,12,15, - v.val[2] = _mm_srli_si128(tmp1,11); //b:1,4,7,10,13,0,0,0,0,0,0,0,0,0,0,0 - v.val[2] = _mm_slli_si128(v.val[2],5); //0,0,0,0,0,b:1,4,7,10,13, 0,0,0,0,0,0 - v.val[2] = _mm_or_si128(v.val[2],tmp3); //0,0,0,0,0,b:1,4,7,10,13,c:0,3,6,9,12,15, - tmp0 = _mm_srli_si128(tmp0, 11); //a:2,5,8,11,14, 0,0,0,0,0,0,0,0,0,0,0, - v.val[2] = _mm_or_si128(v.val[2],tmp0); //a:2,5,8,11,14,b:1,4,7,10,13,c:0,3,6,9,12,15, - return v; -} - -uint16x8x3_t vld3q_u16(__transfersize(24) uint16_t const * ptr); // VLD3.16 {d0, d2, d4}, [r0] -_NEON2SSE_INLINE uint16x8x3_t 
vld3q_u16(__transfersize(24) uint16_t const * ptr) // VLD3.16 {d0, d2, d4}, [r0] -{ - //a0, a1,a2,a3,...a7, b0,b1,b2,b3,...b7, c0,c1,c2,c3...c7 -> a0,a3,a6,b1,b4,b7,c2,c5, a1,a4,a7,b2,b5,c0,c3,c6, a2,a5,b0,b3,b6,c1,c4,c7 - uint16x8x3_t v; - __m128i tmp0, tmp1,tmp2, tmp3; - _NEON2SSE_ALIGN_16 int8_t mask16_0[16] = {0,1, 6,7, 12,13, 2,3, 8,9, 14,15, 4,5, 10,11}; - _NEON2SSE_ALIGN_16 int8_t mask16_1[16] = {2,3, 8,9, 14,15, 4,5, 10,11, 0,1, 6,7, 12,13}; - _NEON2SSE_ALIGN_16 int8_t mask16_2[16] = {4,5, 10,11, 0,1, 6,7, 12,13, 2,3, 8,9, 14,15}; - - v.val[0] = vld1q_u16 (ptr); //a0,a1,a2,a3,...a7, - v.val[1] = vld1q_u16 ((ptr + 8)); //b0,b1,b2,b3...b7 - v.val[2] = vld1q_u16 ((ptr + 16)); //c0,c1,c2,c3,...c7 - - tmp0 = _mm_shuffle_epi8(v.val[0], *(__m128i*)mask16_0); //a0,a3,a6,a1,a4,a7,a2,a5, - tmp1 = _mm_shuffle_epi8(v.val[1], *(__m128i*)mask16_1); //b1,b4,b7,b2,b5,b0,b3,b6 - tmp2 = _mm_shuffle_epi8(v.val[2], *(__m128i*)mask16_2); //c2,c5, c0,c3,c6, c1,c4,c7 - - tmp3 = _mm_slli_si128(tmp0,10); //0,0,0,0,0,a0,a3,a6, - tmp3 = _mm_alignr_epi8(tmp1,tmp3, 10); //a0,a3,a6,b1,b4,b7,x,x - tmp3 = _mm_slli_si128(tmp3, 4); //0,0, a0,a3,a6,b1,b4,b7 - tmp3 = _mm_srli_si128(tmp3, 4); //a0,a3,a6,b1,b4,b7,0,0 - v.val[0] = _mm_slli_si128(tmp2, 12); //0,0,0,0,0,0, c2,c5, - v.val[0] = _mm_or_si128(v.val[0],tmp3); //a0,a3,a6,b1,b4,b7,c2,c5 - - tmp3 = _mm_slli_si128(tmp0, 4); //0,0,a0,a3,a6,a1,a4,a7 - tmp3 = _mm_srli_si128(tmp3,10); //a1,a4,a7, 0,0,0,0,0 - v.val[1] = _mm_srli_si128(tmp1,6); //b2,b5,b0,b3,b6,0,0 - v.val[1] = _mm_slli_si128(v.val[1], 6); //0,0,0,b2,b5,b0,b3,b6, - v.val[1] = _mm_or_si128(v.val[1],tmp3); //a1,a4,a7,b2,b5,b0,b3,b6, - v.val[1] = _mm_slli_si128(v.val[1],6); //0,0,0,a1,a4,a7,b2,b5, - v.val[1] = _mm_srli_si128(v.val[1], 6); //a1,a4,a7,b2,b5,0,0,0, - tmp3 = _mm_srli_si128(tmp2,4); //c0,c3,c6, c1,c4,c7,0,0 - tmp3 = _mm_slli_si128(tmp3,10); //0,0,0,0,0,c0,c3,c6, - v.val[1] = _mm_or_si128(v.val[1],tmp3); //a1,a4,a7,b2,b5,c0,c3,c6, - - tmp3 = _mm_srli_si128(tmp2,10); 
//c1,c4,c7, 0,0,0,0,0 - tmp3 = _mm_slli_si128(tmp3,10); //0,0,0,0,0, c1,c4,c7, - v.val[2] = _mm_srli_si128(tmp1,10); //b0,b3,b6,0,0, 0,0,0 - v.val[2] = _mm_slli_si128(v.val[2],4); //0,0, b0,b3,b6,0,0,0 - v.val[2] = _mm_or_si128(v.val[2],tmp3); //0,0, b0,b3,b6,c1,c4,c7, - tmp0 = _mm_srli_si128(tmp0, 12); //a2,a5,0,0,0,0,0,0 - v.val[2] = _mm_or_si128(v.val[2],tmp0); //a2,a5,b0,b3,b6,c1,c4,c7, - return v; -} - -uint32x4x3_t vld3q_u32(__transfersize(12) uint32_t const * ptr); // VLD3.32 {d0, d2, d4}, [r0] -_NEON2SSE_INLINE uint32x4x3_t vld3q_u32(__transfersize(12) uint32_t const * ptr) // VLD3.32 {d0, d2, d4}, [r0] -{ - //a0,a1,a2,a3, b0,b1,b2,b3, c0,c1,c2,c3 -> a0,a3,b2,c1, a1,b0,b3,c2, a2,b1,c0,c3, - uint32x4x3_t v; - __m128i tmp0, tmp1,tmp2, tmp3; - v.val[0] = vld1q_u32 (ptr); //a0,a1,a2,a3, - v.val[1] = vld1q_u32 ((ptr + 4)); //b0,b1,b2,b3 - v.val[2] = vld1q_u32 ((ptr + 8)); //c0,c1,c2,c3, - - tmp0 = _mm_shuffle_epi32(v.val[0], 0 | (3 << 2) | (1 << 4) | (2 << 6)); //a0,a3,a1,a2 - tmp1 = _mm_shuffle_epi32(v.val[1], _SWAP_HI_LOW32); //b2,b3,b0,b1 - tmp2 = _mm_shuffle_epi32(v.val[2], 1 | (2 << 2) | (0 << 4) | (3 << 6)); //c1,c2, c0,c3 - - tmp3 = _mm_unpacklo_epi32(tmp1, tmp2); //b2,c1, b3,c2 - v.val[0] = _mm_unpacklo_epi64(tmp0,tmp3); //a0,a3,b2,c1 - tmp0 = _mm_unpackhi_epi32(tmp0, tmp1); //a1,b0, a2,b1 - v.val[1] = _mm_shuffle_epi32(tmp0, _SWAP_HI_LOW32 ); //a2,b1, a1,b0, - v.val[1] = _mm_unpackhi_epi64(v.val[1], tmp3); //a1,b0, b3,c2 - v.val[2] = _mm_unpackhi_epi64(tmp0, tmp2); //a2,b1, c0,c3 - return v; -} - -int8x16x3_t vld3q_s8(__transfersize(48) int8_t const * ptr); // VLD3.8 {d0, d2, d4}, [r0] -#define vld3q_s8(ptr) vld3q_u8((uint8_t*) (ptr)) - -int16x8x3_t vld3q_s16(__transfersize(24) int16_t const * ptr); // VLD3.16 {d0, d2, d4}, [r0] -#define vld3q_s16(ptr) vld3q_u16((uint16_t*) (ptr)) - -int32x4x3_t vld3q_s32(__transfersize(12) int32_t const * ptr); // VLD3.32 {d0, d2, d4}, [r0] -#define vld3q_s32(ptr) vld3q_u32((uint32_t*) (ptr)) - -float16x8x3_t 
vld3q_f16(__transfersize(24) __fp16 const * ptr); // VLD3.16 {d0, d2, d4}, [r0] -// IA32 SIMD doesn't work with 16bit floats currently, so need to go to 32 bit and then work with two 128bit registers. See vld1q_f16 for example - -float32x4x3_t vld3q_f32(__transfersize(12) float32_t const * ptr); // VLD3.32 {d0, d2, d4}, [r0] -_NEON2SSE_INLINE float32x4x3_t vld3q_f32(__transfersize(12) float32_t const * ptr) // VLD3.32 {d0, d2, d4}, [r0] -{ - //a0,a1,a2,a3, b0,b1,b2,b3, c0,c1,c2,c3 -> a0,a3,b2,c1, a1,b0,b3,c2, a2,b1,c0,c3, - float32x4x3_t v; - __m128 tmp0, tmp1,tmp2, tmp3; - v.val[0] = vld1q_f32 (ptr); //a0,a1,a2,a3, - v.val[1] = vld1q_f32 ((ptr + 4)); //b0,b1,b2,b3 - v.val[2] = vld1q_f32 ((ptr + 8)); //c0,c1,c2,c3, - - tmp0 = _mm_castsi128_ps(_mm_shuffle_epi32(_mm_castps_si128(v.val[0]), 0 | (3 << 2) | (1 << 4) | (2 << 6))); //a0,a3,a1,a2 - tmp1 = _mm_castsi128_ps(_mm_shuffle_epi32(_mm_castps_si128(v.val[1]), _SWAP_HI_LOW32)); //b2,b3,b0,b1 - tmp2 = _mm_castsi128_ps(_mm_shuffle_epi32(_mm_castps_si128(v.val[2]), 1 | (2 << 2) | (0 << 4) | (3 << 6))); //c1,c2, c0,c3 - tmp3 = _mm_unpacklo_ps(tmp1, tmp2); //b2,c1, b3,c2 - - v.val[0] = _mm_movelh_ps(tmp0,tmp3); //a0,a3,b2,c1 - tmp0 = _mm_unpackhi_ps(tmp0, tmp1); //a1,b0, a2,b1 - v.val[1] = _mm_castsi128_ps(_mm_shuffle_epi32(_mm_castps_si128(tmp0), _SWAP_HI_LOW32 )); //a2,b1, a1,b0, - v.val[1] = _mm_movehl_ps(tmp3,v.val[1]); //a1,b0, b3,c2 - v.val[2] = _mm_movehl_ps(tmp2,tmp0); //a2,b1, c0,c3 - return v; -} - -poly8x16x3_t vld3q_p8(__transfersize(48) poly8_t const * ptr); // VLD3.8 {d0, d2, d4}, [r0] -#define vld3q_p8 vld3q_u8 - -poly16x8x3_t vld3q_p16(__transfersize(24) poly16_t const * ptr); // VLD3.16 {d0, d2, d4}, [r0] -#define vld3q_p16 vld3q_u16 - -uint8x8x3_t vld3_u8(__transfersize(24) uint8_t const * ptr); // VLD3.8 {d0, d1, d2}, [r0] -_NEON2SSE_INLINE uint8x8x3_t vld3_u8(__transfersize(24) uint8_t const * ptr) // VLD3.8 {d0, d1, d2}, [r0] -{ - //a0, a1,a2,a3,...a7, b0,b1,b2,b3,...b7, c0,c1,c2,c3...c7 -> 
a0,a3,a6,b1,b4,b7,c2,c5, a1,a4,a7,b2,b5,c0,c3,c6, a2,a5,b0,b3,b6,c1,c4,c7 - uint8x8x3_t v; - __m128i val0, val1, val2, tmp0, tmp1; - _NEON2SSE_ALIGN_16 int8_t mask8_0[16] = {0,3,6,9,12,15, 1,4,7,10,13, 2,5,8,11,14}; - _NEON2SSE_ALIGN_16 int8_t mask8_1[16] = {2,5, 0,3,6, 1,4,7, 0,0,0,0,0,0,0,0}; - val0 = vld1q_u8 (ptr); //a0,a1,a2,a3,...a7, b0,b1,b2,b3...b7 - val2 = _mm_loadl_epi64((__m128i*)(ptr + 16)); //c0,c1,c2,c3,...c7 - - tmp0 = _mm_shuffle_epi8(val0, *(__m128i*)mask8_0); //a0,a3,a6,b1,b4,b7, a1,a4,a7,b2,b5, a2,a5,b0,b3,b6, - tmp1 = _mm_shuffle_epi8(val2, *(__m128i*)mask8_1); //c2,c5, c0,c3,c6, c1,c4,c7,x,x,x,x,x,x,x,x - val0 = _mm_slli_si128(tmp0,10); - val0 = _mm_srli_si128(val0,10); //a0,a3,a6,b1,b4,b7, 0,0,0,0,0,0,0,0,0,0 - val2 = _mm_slli_si128(tmp1,6); //0,0,0,0,0,0,c2,c5,x,x,x,x,x,x,x,x - val0 = _mm_or_si128(val0,val2); //a0,a3,a6,b1,b4,b7,c2,c5 x,x,x,x,x,x,x,x - _M64(v.val[0], val0); - val1 = _mm_slli_si128(tmp0,5); //0,0,0,0,0,0,0,0,0,0,0, a1,a4,a7,b2,b5, - val1 = _mm_srli_si128(val1,11); //a1,a4,a7,b2,b5,0,0,0,0,0,0,0,0,0,0,0, - val2 = _mm_srli_si128(tmp1,2); //c0,c3,c6,c1,c4,c7,x,x,x,x,x,x,x,x,0,0 - val2 = _mm_slli_si128(val2,5); //0,0,0,0,0,c0,c3,c6,0,0,0,0,0,0,0,0 - val1 = _mm_or_si128(val1,val2); //a1,a4,a7,b2,b5,c0,c3,c6,x,x,x,x,x,x,x,x - _M64(v.val[1], val1); - - tmp0 = _mm_srli_si128(tmp0,11); //a2,a5,b0,b3,b6,0,0,0,0,0,0,0,0,0,0,0, - val2 = _mm_srli_si128(tmp1,5); //c1,c4,c7,0,0,0,0,0,0,0,0,0,0,0,0,0 - val2 = _mm_slli_si128(val2,5); //0,0,0,0,0,c1,c4,c7, - val2 = _mm_or_si128(tmp0, val2); //a2,a5,b0,b3,b6,c1,c4,c7,x,x,x,x,x,x,x,x - _M64(v.val[2], val2); - return v; -} - -uint16x4x3_t vld3_u16(__transfersize(12) uint16_t const * ptr); // VLD3.16 {d0, d1, d2}, [r0] -_NEON2SSE_INLINE uint16x4x3_t vld3_u16(__transfersize(12) uint16_t const * ptr) // VLD3.16 {d0, d1, d2}, [r0] -{ - //a0,a1,a2,a3, b0,b1,b2,b3, c0,c1,c2,c3 -> a0,a3,b2,c1, a1,b0,b3,c2, a2,b1,c0,c3, - uint16x4x3_t v; - __m128i val0, val1, val2, tmp0, tmp1; - _NEON2SSE_ALIGN_16 int8_t 
mask16[16] = {0,1, 6,7, 12,13, 2,3, 8,9, 14,15, 4,5, 10,11}; - val0 = vld1q_u16 (ptr); //a0,a1,a2,a3, b0,b1,b2,b3 - val2 = _mm_loadl_epi64((__m128i*)(ptr + 8)); //c0,c1,c2,c3, x,x,x,x - - tmp0 = _mm_shuffle_epi8(val0, *(__m128i*)mask16); //a0, a3, b2,a1, b0, b3, a2, b1 - tmp1 = _mm_shufflelo_epi16(val2, 201); //11 00 10 01 : c1, c2, c0, c3, - val0 = _mm_slli_si128(tmp0,10); - val0 = _mm_srli_si128(val0,10); //a0, a3, b2, 0,0, 0,0, - val2 = _mm_slli_si128(tmp1,14); //0,0,0,0,0,0,0,c1 - val2 = _mm_srli_si128(val2,8); //0,0,0,c1,0,0,0,0 - val0 = _mm_or_si128(val0,val2); //a0, a3, b2, c1, x,x,x,x - _M64(v.val[0], val0); - - val1 = _mm_slli_si128(tmp0,4); //0,0,0,0,0,a1, b0, b3 - val1 = _mm_srli_si128(val1,10); //a1, b0, b3, 0,0, 0,0, - val2 = _mm_srli_si128(tmp1,2); //c2, 0,0,0,0,0,0,0, - val2 = _mm_slli_si128(val2,6); //0,0,0,c2,0,0,0,0 - val1 = _mm_or_si128(val1,val2); //a1, b0, b3, c2, x,x,x,x - _M64(v.val[1], val1); - - tmp0 = _mm_srli_si128(tmp0,12); //a2, b1,0,0,0,0,0,0 - tmp1 = _mm_srli_si128(tmp1,4); - tmp1 = _mm_slli_si128(tmp1,4); //0,0,c0, c3, - val2 = _mm_or_si128(tmp0, tmp1); //a2, b1, c0, c3, - _M64(v.val[2], val2); - return v; -} - -uint32x2x3_t vld3_u32(__transfersize(6) uint32_t const * ptr); // VLD3.32 {d0, d1, d2}, [r0] -_NEON2SSE_INLINE uint32x2x3_t vld3_u32(__transfersize(6) uint32_t const * ptr) // VLD3.32 {d0, d1, d2}, [r0] -{ - //a0,a1, b0,b1, c0,c1, -> a0,b1, a1,c0, b0,c1 - uint32x2x3_t v; - __m128i val0, val1, val2; - val0 = vld1q_u32 (ptr); //a0,a1, b0,b1, - val2 = _mm_loadl_epi64((__m128i*) (ptr + 4)); //c0,c1, x,x - - val0 = _mm_shuffle_epi32(val0, 0 | (3 << 2) | (1 << 4) | (2 << 6)); //a0,b1, a1, b0 - _M64(v.val[0], val0); - val2 = _mm_slli_si128(val2, 8); //x, x,c0,c1, - val1 = _mm_unpackhi_epi32(val0,val2); //a1,c0, b0, c1 - _M64(v.val[1], val1); - val2 = _mm_srli_si128(val1, 8); //b0, c1, x, x, - _M64(v.val[2], val2); - return v; -} -uint64x1x3_t vld3_u64(__transfersize(3) uint64_t const * ptr); // VLD1.64 {d0, d1, d2}, [r0] 
-_NEON2SSE_INLINE uint64x1x3_t vld3_u64(__transfersize(3) uint64_t const * ptr) // VLD1.64 {d0, d1, d2}, [r0] -{ - uint64x1x3_t v; - v.val[0].m64_u64[0] = *(ptr); - v.val[1].m64_u64[0] = *(ptr + 1); - v.val[2].m64_u64[0] = *(ptr + 2); - return v; -} - -int8x8x3_t vld3_s8(__transfersize(24) int8_t const * ptr); // VLD3.8 {d0, d1, d2}, [r0] -#define vld3_s8(ptr) vld3_u8((uint8_t*)ptr) - -int16x4x3_t vld3_s16(__transfersize(12) int16_t const * ptr); // VLD3.16 {d0, d1, d2}, [r0] -#define vld3_s16(ptr) vld3_u16((uint16_t*)ptr) - -int32x2x3_t vld3_s32(__transfersize(6) int32_t const * ptr); // VLD3.32 {d0, d1, d2}, [r0] -#define vld3_s32(ptr) vld3_u32((uint32_t*)ptr) - -int64x1x3_t vld3_s64(__transfersize(3) int64_t const * ptr); // VLD1.64 {d0, d1, d2}, [r0] -#define vld3_s64(ptr) vld3_u64((uint64_t*)ptr) - -float16x4x3_t vld3_f16(__transfersize(12) __fp16 const * ptr); // VLD3.16 {d0, d1, d2}, [r0] -// IA32 SIMD doesn't work with 16bit floats currently, so need to go to 32 bit and then work with two 128bit registers. 
See vld1q_f16 for example - -float32x2x3_t vld3_f32(__transfersize(6) float32_t const * ptr); // VLD3.32 {d0, d1, d2}, [r0] -_NEON2SSE_INLINE float32x2x3_t vld3_f32(__transfersize(6) float32_t const * ptr) -{ - //a0,a1, b0,b1, c0,c1, -> a0,b1, a1,c0, b0,c1 - float32x2x3_t v; - v.val[0].m64_f32[0] = *(ptr); - v.val[0].m64_f32[1] = *(ptr + 3); - - v.val[1].m64_f32[0] = *(ptr + 1); - v.val[1].m64_f32[1] = *(ptr + 4); - - v.val[2].m64_f32[0] = *(ptr + 2); - v.val[2].m64_f32[1] = *(ptr + 5); - return v; -} - -poly8x8x3_t vld3_p8(__transfersize(24) poly8_t const * ptr); // VLD3.8 {d0, d1, d2}, [r0] -#define vld3_p8 vld3_u8 - -poly16x4x3_t vld3_p16(__transfersize(12) poly16_t const * ptr); // VLD3.16 {d0, d1, d2}, [r0] -#define vld3_p16 vld3_u16 - -//*************** Quadruples load ******************************** -//***************************************************************** -uint8x16x4_t vld4q_u8(__transfersize(64) uint8_t const * ptr); // VLD4.8 {d0, d2, d4, d6}, [r0] -_NEON2SSE_INLINE uint8x16x4_t vld4q_u8(__transfersize(64) uint8_t const * ptr) // VLD4.8 {d0, d2, d4, d6}, [r0] -{ - uint8x16x4_t v; - __m128i tmp3, tmp2, tmp1, tmp0; - - v.val[0] = vld1q_u8 ( ptr); //a0,a1,a2,...a7, ...a15 - v.val[1] = vld1q_u8 ( (ptr + 16)); //b0, b1,b2,...b7.... b15 - v.val[2] = vld1q_u8 ( (ptr + 32)); //c0, c1,c2,...c7....c15 - v.val[3] = vld1q_u8 ( (ptr + 48)); //d0,d1,d2,...d7....d15 - - tmp0 = _mm_unpacklo_epi8(v.val[0],v.val[1]); //a0,b0, a1,b1, a2,b2, a3,b3,....a7,b7 - tmp1 = _mm_unpacklo_epi8(v.val[2],v.val[3]); //c0,d0, c1,d1, c2,d2, c3,d3,... 
c7,d7 - tmp2 = _mm_unpackhi_epi8(v.val[0],v.val[1]); //a8,b8, a9,b9, a10,b10, a11,b11,...a15,b15 - tmp3 = _mm_unpackhi_epi8(v.val[2],v.val[3]); //c8,d8, c9,d9, c10,d10, c11,d11,...c15,d15 - - v.val[0] = _mm_unpacklo_epi8(tmp0, tmp2); //a0,a8, b0,b8, a1,a9, b1,b9, ....a3,a11, b3,b11 - v.val[1] = _mm_unpackhi_epi8(tmp0, tmp2); //a4,a12, b4,b12, a5,a13, b5,b13,....a7,a15,b7,b15 - v.val[2] = _mm_unpacklo_epi8(tmp1, tmp3); //c0,c8, d0,d8, c1,c9, d1,d9.....d3,d11 - v.val[3] = _mm_unpackhi_epi8(tmp1, tmp3); //c4,c12,d4,d12, c5,c13, d5,d13,....d7,d15 - - tmp0 = _mm_unpacklo_epi32(v.val[0], v.val[2] ); ///a0,a8, b0,b8, c0,c8, d0,d8, a1,a9, b1,b9, c1,c9, d1,d9 - tmp1 = _mm_unpackhi_epi32(v.val[0], v.val[2] ); //a2,a10, b2,b10, c2,c10, d2,d10, a3,a11, b3,b11, c3,c11, d3,d11 - tmp2 = _mm_unpacklo_epi32(v.val[1], v.val[3] ); //a4,a12, b4,b12, c4,c12, d4,d12, a5,a13, b5,b13, c5,c13, d5,d13, - tmp3 = _mm_unpackhi_epi32(v.val[1], v.val[3] ); //a6,a14, b6,b14, c6,c14, d6,d14, a7,a15,b7,b15,c7,c15,d7,d15 - - v.val[0] = _mm_unpacklo_epi8(tmp0, tmp2); //a0,a4,a8,a12,b0,b4,b8,b12,c0,c4,c8,c12,d0,d4,d8,d12 - v.val[1] = _mm_unpackhi_epi8(tmp0, tmp2); //a1,a5, a9, a13, b1,b5, b9,b13, c1,c5, c9, c13, d1,d5, d9,d13 - v.val[2] = _mm_unpacklo_epi8(tmp1, tmp3); //a2,a6, a10,a14, b2,b6, b10,b14,c2,c6, c10,c14, d2,d6, d10,d14 - v.val[3] = _mm_unpackhi_epi8(tmp1, tmp3); //a3,a7, a11,a15, b3,b7, b11,b15,c3,c7, c11, c15,d3,d7, d11,d15 - return v; -} - -uint16x8x4_t vld4q_u16(__transfersize(32) uint16_t const * ptr); // VLD4.16 {d0, d2, d4, d6}, [r0] -_NEON2SSE_INLINE uint16x8x4_t vld4q_u16(__transfersize(32) uint16_t const * ptr) // VLD4.16 {d0, d2, d4, d6}, [r0] -{ - uint16x8x4_t v; - __m128i tmp3, tmp2, tmp1, tmp0; - tmp0 = vld1q_u16 (ptr); //a0,a1,a2,...a7 - tmp1 = vld1q_u16 ((ptr + 8)); //b0, b1,b2,...b7 - tmp2 = vld1q_u16 ((ptr + 16)); //c0, c1,c2,...c7 - tmp3 = vld1q_u16 ((ptr + 24)); //d0,d1,d2,...d7 - v.val[0] = _mm_unpacklo_epi16(tmp0,tmp1); //a0,b0, a1,b1, a2,b2, a3,b3, - v.val[1] = 
_mm_unpacklo_epi16(tmp2,tmp3); //c0,d0, c1,d1, c2,d2, c3,d3, - v.val[2] = _mm_unpackhi_epi16(tmp0,tmp1); //a4,b4, a5,b5, a6,b6, a7,b7 - v.val[3] = _mm_unpackhi_epi16(tmp2,tmp3); //c4,d4, c5,d5, c6,d6, c7,d7 - tmp0 = _mm_unpacklo_epi16(v.val[0], v.val[2]); //a0,a4, b0,b4, a1,a5, b1,b5 - tmp1 = _mm_unpackhi_epi16(v.val[0], v.val[2]); //a2,a6, b2,b6, a3,a7, b3,b7 - tmp2 = _mm_unpacklo_epi16(v.val[1], v.val[3]); //c0,c4, d0,d4, c1,c5, d1,d5 - tmp3 = _mm_unpackhi_epi16(v.val[1], v.val[3]); //c2,c6, d2,d6, c3,c7, d3,d7 - v.val[0] = _mm_unpacklo_epi64(tmp0, tmp2); //a0,a4, b0,b4, c0,c4, d0,d4, - v.val[1] = _mm_unpackhi_epi64(tmp0, tmp2); //a1,a5, b1,b5, c1,c5, d1,d5 - v.val[2] = _mm_unpacklo_epi64(tmp1, tmp3); //a2,a6, b2,b6, c2,c6, d2,d6, - v.val[3] = _mm_unpackhi_epi64(tmp1, tmp3); //a3,a7, b3,b7, c3,c7, d3,d7 - return v; -} - -uint32x4x4_t vld4q_u32(__transfersize(16) uint32_t const * ptr); // VLD4.32 {d0, d2, d4, d6}, [r0] -_NEON2SSE_INLINE uint32x4x4_t vld4q_u32(__transfersize(16) uint32_t const * ptr) // VLD4.32 {d0, d2, d4, d6}, [r0] -{ - uint32x4x4_t v; - __m128i tmp3, tmp2, tmp1, tmp0; - v.val[0] = vld1q_u32 (ptr); - v.val[1] = vld1q_u32 ((ptr + 4)); - v.val[2] = vld1q_u32 ((ptr + 8)); - v.val[3] = vld1q_u32 ((ptr + 12)); - tmp0 = _mm_unpacklo_epi32(v.val[0],v.val[1]); - tmp1 = _mm_unpacklo_epi32(v.val[2],v.val[3]); - tmp2 = _mm_unpackhi_epi32(v.val[0],v.val[1]); - tmp3 = _mm_unpackhi_epi32(v.val[2],v.val[3]); - v.val[0] = _mm_unpacklo_epi64(tmp0, tmp1); - v.val[1] = _mm_unpackhi_epi64(tmp0, tmp1); - v.val[2] = _mm_unpacklo_epi64(tmp2, tmp3); - v.val[3] = _mm_unpackhi_epi64(tmp2, tmp3); - return v; -} - -int8x16x4_t vld4q_s8(__transfersize(64) int8_t const * ptr); // VLD4.8 {d0, d2, d4, d6}, [r0] -#define vld4q_s8(ptr) vld4q_u8((uint8_t*)ptr) - -int16x8x4_t vld4q_s16(__transfersize(32) int16_t const * ptr); // VLD4.16 {d0, d2, d4, d6}, [r0] -#define vld4q_s16(ptr) vld4q_u16((uint16_t*)ptr) - -int32x4x4_t vld4q_s32(__transfersize(16) int32_t const * ptr); // 
VLD4.32 {d0, d2, d4, d6}, [r0] -#define vld4q_s32(ptr) vld4q_u32((uint32_t*)ptr) - -float16x8x4_t vld4q_f16(__transfersize(32) __fp16 const * ptr); // VLD4.16 {d0, d2, d4, d6}, [r0] -// IA32 SIMD doesn't work with 16bit floats currently, so need to go to 32 bit and then work with two 128bit registers. See vld1q_f16 for example - -float32x4x4_t vld4q_f32(__transfersize(16) float32_t const * ptr); // VLD4.32 {d0, d2, d4, d6}, [r0] -_NEON2SSE_INLINE float32x4x4_t vld4q_f32(__transfersize(16) float32_t const * ptr) // VLD4.32 {d0, d2, d4, d6}, [r0] -{ - float32x4x4_t v; - __m128 tmp3, tmp2, tmp1, tmp0; - - v.val[0] = vld1q_f32 ((float*) ptr); - v.val[1] = vld1q_f32 ((float*) (ptr + 4)); - v.val[2] = vld1q_f32 ((float*) (ptr + 8)); - v.val[3] = vld1q_f32 ((float*) (ptr + 12)); - tmp0 = _mm_unpacklo_ps(v.val[0], v.val[1]); - tmp2 = _mm_unpacklo_ps(v.val[2], v.val[3]); - tmp1 = _mm_unpackhi_ps(v.val[0], v.val[1]); - tmp3 = _mm_unpackhi_ps(v.val[2], v.val[3]); - v.val[0] = _mm_movelh_ps(tmp0, tmp2); - v.val[1] = _mm_movehl_ps(tmp2, tmp0); - v.val[2] = _mm_movelh_ps(tmp1, tmp3); - v.val[3] = _mm_movehl_ps(tmp3, tmp1); - return v; -} - -poly8x16x4_t vld4q_p8(__transfersize(64) poly8_t const * ptr); // VLD4.8 {d0, d2, d4, d6}, [r0] -#define vld4q_p8 vld4q_u8 - -poly16x8x4_t vld4q_p16(__transfersize(32) poly16_t const * ptr); // VLD4.16 {d0, d2, d4, d6}, [r0] -#define vld4q_p16 vld4q_s16 - -uint8x8x4_t vld4_u8(__transfersize(32) uint8_t const * ptr); // VLD4.8 {d0, d1, d2, d3}, [r0] -_NEON2SSE_INLINE uint8x8x4_t vld4_u8(__transfersize(32) uint8_t const * ptr) // VLD4.8 {d0, d1, d2, d3}, [r0] -{ - uint8x8x4_t v; - __m128i sh0, sh1; - __m128i val0, val2; - _NEON2SSE_ALIGN_16 int8_t mask4_8[16] = {0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15}; - - val0 = vld1q_u8(( ptr)); //load first 64-bits in val[0] and val[1] - val2 = vld1q_u8(( ptr + 16)); //load third and fourth 64-bits in val[2], val[3] - - sh0 = _mm_shuffle_epi8(val0, *(__m128i*)mask4_8); - sh1 =
_mm_shuffle_epi8(val2, *(__m128i*)mask4_8); - val0 = _mm_unpacklo_epi32(sh0,sh1); //0,4,8,12,16,20,24,28, 1,5,9,13,17,21,25,29 - vst1q_u8(&v.val[0], val0 ); - val2 = _mm_unpackhi_epi32(sh0,sh1); //2,6,10,14,18,22,26,30, 3,7,11,15,19,23,27,31 - vst1q_u8(&v.val[2], val2 ); - return v; -} - -uint16x4x4_t vld4_u16(__transfersize(16) uint16_t const * ptr); // VLD4.16 {d0, d1, d2, d3}, [r0] -_NEON2SSE_INLINE uint16x4x4_t vld4_u16(__transfersize(16) uint16_t const * ptr) // VLD4.16 {d0, d1, d2, d3}, [r0] -{ - uint16x4x4_t v; - __m128i sh0, sh1; - __m128i val0, val2; - _NEON2SSE_ALIGN_16 int8_t mask4_16[16] = {0,1, 8,9, 2,3, 10,11, 4,5, 12,13, 6,7, 14,15}; //0, 4, 1, 5, 2, 6, 3, 7 - val0 = vld1q_u16 ( (ptr)); //load first 64-bits in val[0] and val[1] - val2 = vld1q_u16 ( (ptr + 8)); //load third and fourth 64-bits in val[2], val[3] - sh0 = _mm_shuffle_epi8(val0, *(__m128i*)mask4_16); - sh1 = _mm_shuffle_epi8(val2, *(__m128i*)mask4_16); - val0 = _mm_unpacklo_epi32(sh0,sh1); //0,4,8,12, 1,5,9,13 - vst1q_u16(&v.val[0], val0 ); - val2 = _mm_unpackhi_epi32(sh0,sh1); //2,6,10,14, 3,7,11,15 - vst1q_u16(&v.val[2], val2 ); - return v; -} - -uint32x2x4_t vld4_u32(__transfersize(8) uint32_t const * ptr); // VLD4.32 {d0, d1, d2, d3}, [r0] -_NEON2SSE_INLINE uint32x2x4_t vld4_u32(__transfersize(8) uint32_t const * ptr) -{ - //a0,a1, b0,b1, c0,c1, d0,d1 -> a0,c0, a1,c1, b0,d0, b1,d1 - uint32x2x4_t v; - __m128i val0, val01, val2; - val0 = vld1q_u32 (ptr); //a0,a1, b0,b1, - val2 = vld1q_u32 ((ptr + 4)); //c0,c1, d0,d1 - val01 = _mm_unpacklo_epi32(val0,val2); //a0, c0, a1,c1, - val2 = _mm_unpackhi_epi32(val0,val2); //b0,d0, b1, d1 - vst1q_u32(&v.val[0], val01); - vst1q_u32(&v.val[2], val2 ); - return v; -} - -uint64x1x4_t vld4_u64(__transfersize(4) uint64_t const * ptr); // VLD1.64 {d0, d1, d2, d3}, [r0] -_NEON2SSE_INLINE uint64x1x4_t vld4_u64(__transfersize(4) uint64_t const * ptr) // VLD1.64 {d0, d1, d2, d3}, [r0] -{ - uint64x1x4_t v; - v.val[0].m64_u64[0] = *(ptr); //load first 64-bits in
val[0] - v.val[1].m64_u64[0] = *(ptr + 1); //load second 64-bit value into val[1] - v.val[2].m64_u64[0] = *(ptr + 2); //load third 64-bit value into val[2] - v.val[3].m64_u64[0] = *(ptr + 3); //load fourth 64-bit value into val[3] - return v; -} - -int8x8x4_t vld4_s8(__transfersize(32) int8_t const * ptr); // VLD4.8 {d0, d1, d2, d3}, [r0] -#define vld4_s8(ptr) vld4_u8((uint8_t*)ptr) - -int16x4x4_t vld4_s16(__transfersize(16) int16_t const * ptr); // VLD4.16 {d0, d1, d2, d3}, [r0] -#define vld4_s16(ptr) vld4_u16((uint16_t*)ptr) - -int32x2x4_t vld4_s32(__transfersize(8) int32_t const * ptr); // VLD4.32 {d0, d1, d2, d3}, [r0] -#define vld4_s32(ptr) vld4_u32((uint32_t*)ptr) - -int64x1x4_t vld4_s64(__transfersize(4) int64_t const * ptr); // VLD1.64 {d0, d1, d2, d3}, [r0] -#define vld4_s64(ptr) vld4_u64((uint64_t*)ptr) - -float16x4x4_t vld4_f16(__transfersize(16) __fp16 const * ptr); // VLD4.16 {d0, d1, d2, d3}, [r0] -// IA32 SIMD doesn't work with 16bit floats currently, so need to go to 32 bit and then work with two 128bit registers.
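The vld4 implementations above all realize the same de-interleaving contract: lane i of result register val[k] receives ptr[4*i + k]. A minimal scalar reference model of that contract can be used to cross-check the SSE shuffle sequences; the function name and fixed 4-lane shape below are illustrative only, not part of the NEON2SSE API.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of the NEON vld4 de-interleave implemented by the SSE
   shuffle sequences above: lane i of result register k is ptr[4*i + k].
   Hypothetical helper for a 4-lane (uint16x4x4-style) result. */
static void ref_vld4_u16(const uint16_t *ptr, uint16_t val[4][4])
{
    for (int i = 0; i < 4; i++)      /* lane within each result register */
        for (int k = 0; k < 4; k++)  /* which of the 4 result registers  */
            val[k][i] = ptr[4 * i + k];
}
```

Feeding it 0..15 yields val[0] = {0,4,8,12} and val[2] = {2,6,10,14}, matching the //0,4,8,12 and //2,6,10,14 lane comments in vld4_u16 above.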
See vld1q_f16 for example - -float32x2x4_t vld4_f32(__transfersize(8) float32_t const * ptr); // VLD4.32 {d0, d1, d2, d3}, [r0] -_NEON2SSE_INLINE float32x2x4_t vld4_f32(__transfersize(8) float32_t const * ptr) // VLD4.32 {d0, d1, d2, d3}, [r0] -{ - //a0,a1, b0,b1, c0,c1, d0,d1 -> a0,c0, a1,c1, b0,d0, b1,d1 - float32x2x4_t res; - res.val[0].m64_f32[0] = *(ptr); - res.val[0].m64_f32[1] = *(ptr + 4); - res.val[1].m64_f32[0] = *(ptr + 1); - res.val[1].m64_f32[1] = *(ptr + 5); - res.val[2].m64_f32[0] = *(ptr + 2); - res.val[2].m64_f32[1] = *(ptr + 6); - res.val[3].m64_f32[0] = *(ptr + 3); - res.val[3].m64_f32[1] = *(ptr + 7); - return res; -} - -poly8x8x4_t vld4_p8(__transfersize(32) poly8_t const * ptr); // VLD4.8 {d0, d1, d2, d3}, [r0] -#define vld4_p8 vld4_u8 - -poly16x4x4_t vld4_p16(__transfersize(16) poly16_t const * ptr); // VLD4.16 {d0, d1, d2, d3}, [r0] -#define vld4_p16 vld4_u16 - -//************* Duplicate (or propagate) ptr[0] to all val[0] lanes and ptr[1] to all val[1] lanes ******************* -//******************************************************************************************************************* -uint8x8x2_t vld2_dup_u8(__transfersize(2) uint8_t const * ptr); // VLD2.8 {d0[], d1[]}, [r0] -_NEON2SSE_INLINE uint8x8x2_t vld2_dup_u8(__transfersize(2) uint8_t const * ptr) // VLD2.8 {d0[], d1[]}, [r0] -{ - uint8x8x2_t v; - __m128i val0, val1; - val0 = LOAD_SI128(ptr); //0,1,x,x, x,x,x,x,x,x,x,x, x,x,x,x - val1 = _mm_unpacklo_epi8(val0,val0); //0,0,1,1,x,x,x,x, x,x,x,x,x,x,x,x, - val1 = _mm_unpacklo_epi16(val1,val1); //0,0,0,0, 1,1,1,1,x,x,x,x, x,x,x,x - val0 = _mm_unpacklo_epi32(val1,val1); //0,0,0,0, 0,0,0,0,1,1,1,1,1,1,1,1, - vst1q_u8(v.val, val0); - return v; -} - -uint16x4x2_t vld2_dup_u16(__transfersize(2) uint16_t const * ptr); // VLD2.16 {d0[], d1[]}, [r0] -_NEON2SSE_INLINE uint16x4x2_t vld2_dup_u16(__transfersize(2) uint16_t const * ptr) // VLD2.16 {d0[], d1[]}, [r0] -{ - uint16x4x2_t v; - __m128i val0, val1; - val1 = LOAD_SI128(ptr); 
//0,1,x,x, x,x,x,x - val0 = _mm_shufflelo_epi16(val1, 0); //00 00 00 00 (all 0) - _M64(v.val[0], val0); - val1 = _mm_shufflelo_epi16(val1, 85); //01 01 01 01 (all 1) - _M64(v.val[1], val1); - return v; -} - -uint32x2x2_t vld2_dup_u32(__transfersize(2) uint32_t const * ptr); // VLD2.32 {d0[], d1[]}, [r0] -_NEON2SSE_INLINE uint32x2x2_t vld2_dup_u32(__transfersize(2) uint32_t const * ptr) // VLD2.32 {d0[], d1[]}, [r0] -{ - uint32x2x2_t v; - __m128i val0; - val0 = LOAD_SI128(ptr); //0,1,x,x - val0 = _mm_shuffle_epi32(val0, 0 | (0 << 2) | (1 << 4) | (1 << 6)); //0,0,1,1 - vst1q_u32(v.val, val0); - return v; -} - -uint64x1x2_t vld2_dup_u64(__transfersize(2) uint64_t const * ptr); // VLD1.64 {d0, d1}, [r0] -#define vld2_dup_u64 vld2_u64 - -int8x8x2_t vld2_dup_s8(__transfersize(2) int8_t const * ptr); // VLD2.8 {d0[], d1[]}, [r0] -#define vld2_dup_s8(ptr) vld2_dup_u8((uint8_t*)ptr) - -int16x4x2_t vld2_dup_s16(__transfersize(2) int16_t const * ptr); // VLD2.16 {d0[], d1[]}, [r0] -#define vld2_dup_s16(ptr) vld2_dup_u16((uint16_t*)ptr) - -int32x2x2_t vld2_dup_s32(__transfersize(2) int32_t const * ptr); // VLD2.32 {d0[], d1[]}, [r0] -#define vld2_dup_s32(ptr) vld2_dup_u32((uint32_t*)ptr) - -int64x1x2_t vld2_dup_s64(__transfersize(2) int64_t const * ptr); // VLD1.64 {d0, d1}, [r0] -#define vld2_dup_s64(ptr) vld2_dup_u64((uint64_t*)ptr) - -float16x4x2_t vld2_dup_f16(__transfersize(2) __fp16 const * ptr); // VLD2.16 {d0[], d1[]}, [r0] -// IA32 SIMD doesn't work with 16bit floats currently, so need to go to 32 bit and then work with two 128bit registers. 
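The immediates passed to _mm_shufflelo_epi16 and _mm_shuffle_epi32 in these _dup implementations pack four 2-bit source-lane selectors, least-significant pair selecting result lane 0. A small helper (hypothetical name, shown only to make constants like 85 readable; it mirrors the 0 | (0 << 2) | (1 << 4) | (1 << 6) expressions used above):

```c
#include <assert.h>

/* Build a 4x2-bit shuffle immediate: result lane n takes source lane ln.
   Illustrative helper, not part of the NEON2SSE header. */
static int shuffle_imm(int l0, int l1, int l2, int l3)
{
    return l0 | (l1 << 2) | (l2 << 4) | (l3 << 6);
}
```

shuffle_imm(1,1,1,1) is 85, the "(all 1)" broadcast used in vld2_dup_u16; 170 and 255 are the "(all 2)" and "(all 3)" patterns used by the vld3_dup and vld4_dup variants further down.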
See vld1q_f16 for example - -float32x2x2_t vld2_dup_f32(__transfersize(2) float32_t const * ptr); // VLD2.32 {d0[], d1[]}, [r0] -_NEON2SSE_INLINE float32x2x2_t vld2_dup_f32(__transfersize(2) float32_t const * ptr) // VLD2.32 {d0[], d1[]}, [r0] -{ - float32x2x2_t v; - v.val[0].m64_f32[0] = *(ptr); //0,0 - v.val[0].m64_f32[1] = *(ptr); //0,0 - v.val[1].m64_f32[0] = *(ptr + 1); //1,1 - v.val[1].m64_f32[1] = *(ptr + 1); //1,1 - return v; -} - -poly8x8x2_t vld2_dup_p8(__transfersize(2) poly8_t const * ptr); // VLD2.8 {d0[], d1[]}, [r0] -#define vld2_dup_p8 vld2_dup_u8 - -poly16x4x2_t vld2_dup_p16(__transfersize(2) poly16_t const * ptr); // VLD2.16 {d0[], d1[]}, [r0] -#define vld2_dup_p16 vld2_dup_s16 - -//************* Duplicate (or propagate)triplets: ******************* -//******************************************************************** -//ptr[0] to all val[0] lanes, ptr[1] to all val[1] lanes and ptr[2] to all val[2] lanes -uint8x8x3_t vld3_dup_u8(__transfersize(3) uint8_t const * ptr); // VLD3.8 {d0[], d1[], d2[]}, [r0] -_NEON2SSE_INLINE uint8x8x3_t vld3_dup_u8(__transfersize(3) uint8_t const * ptr) // VLD3.8 {d0[], d1[], d2[]}, [r0] -{ - uint8x8x3_t v; - __m128i val0, val1, val2; - val0 = LOAD_SI128(ptr); //0,1,2,x, x,x,x,x,x,x,x,x, x,x,x,x - val1 = _mm_unpacklo_epi8(val0,val0); //0,0,1,1,2,2,x,x, x,x,x,x,x,x,x,x, - val1 = _mm_unpacklo_epi16(val1,val1); //0,0,0,0, 1,1,1,1,2,2,2,2,x,x,x,x, - val0 = _mm_unpacklo_epi32(val1,val1); //0,0,0,0, 0,0,0,0,1,1,1,1,1,1,1,1, - val2 = _mm_unpackhi_epi32(val1,val1); // 2,2,2,2,2,2,2,2, x,x,x,x,x,x,x,x, - vst1q_u8(v.val, val0); - _M64(v.val[2], val2); - return v; -} - -uint16x4x3_t vld3_dup_u16(__transfersize(3) uint16_t const * ptr); // VLD3.16 {d0[], d1[], d2[]}, [r0] -_NEON2SSE_INLINE uint16x4x3_t vld3_dup_u16(__transfersize(3) uint16_t const * ptr) // VLD3.16 {d0[], d1[], d2[]}, [r0] -{ - uint16x4x3_t v; - __m128i val0, val1, val2; - val2 = LOAD_SI128(ptr); //0,1,2,x, x,x,x,x - val0 = _mm_shufflelo_epi16(val2, 0); //00 00 
00 00 (all 0) - val1 = _mm_shufflelo_epi16(val2, 85); //01 01 01 01 (all 1) - val2 = _mm_shufflelo_epi16(val2, 170); //10 10 10 10 (all 2) - _M64(v.val[0], val0); - _M64(v.val[1], val1); - _M64(v.val[2], val2); - return v; -} - -uint32x2x3_t vld3_dup_u32(__transfersize(3) uint32_t const * ptr); // VLD3.32 {d0[], d1[], d2[]}, [r0] -_NEON2SSE_INLINE uint32x2x3_t vld3_dup_u32(__transfersize(3) uint32_t const * ptr) // VLD3.32 {d0[], d1[], d2[]}, [r0] -{ - uint32x2x3_t v; - __m128i val0, val1, val2; - val2 = LOAD_SI128(ptr); //0,1,2,x - val0 = _mm_shuffle_epi32(val2, 0 | (0 << 2) | (2 << 4) | (2 << 6)); //0,0,2,2 - val1 = _mm_shuffle_epi32(val2, 1 | (1 << 2) | (2 << 4) | (2 << 6)); //1,1,2,2 - val2 = _mm_srli_si128(val0, 8); //2,2,0x0,0x0 - _M64(v.val[0], val0); - _M64(v.val[1], val1); - _M64(v.val[2], val2); - return v; -} - -uint64x1x3_t vld3_dup_u64(__transfersize(3) uint64_t const * ptr); // VLD1.64 {d0, d1, d2}, [r0] -_NEON2SSE_INLINE uint64x1x3_t vld3_dup_u64(__transfersize(3) uint64_t const * ptr) // VLD1.64 {d0, d1, d2}, [r0] -{ - uint64x1x3_t v; - v.val[0].m64_u64[0] = *(ptr); - v.val[1].m64_u64[0] = *(ptr + 1); - v.val[2].m64_u64[0] = *(ptr + 2); - return v; -} - -int8x8x3_t vld3_dup_s8(__transfersize(3) int8_t const * ptr); // VLD3.8 {d0[], d1[], d2[]}, [r0] -#define vld3_dup_s8(ptr) vld3_dup_u8((uint8_t*)ptr) - -int16x4x3_t vld3_dup_s16(__transfersize(3) int16_t const * ptr); // VLD3.16 {d0[], d1[], d2[]}, [r0] -#define vld3_dup_s16(ptr) vld3_dup_u16((uint16_t*)ptr) - -int32x2x3_t vld3_dup_s32(__transfersize(3) int32_t const * ptr); // VLD3.32 {d0[], d1[], d2[]}, [r0] -#define vld3_dup_s32(ptr) vld3_dup_u32((uint32_t*)ptr) - -int64x1x3_t vld3_dup_s64(__transfersize(3) int64_t const * ptr); // VLD1.64 {d0, d1, d2}, [r0] -#define vld3_dup_s64(ptr) vld3_dup_u64((uint64_t*)ptr) - - -float16x4x3_t vld3_dup_f16(__transfersize(3) __fp16 const * ptr); // VLD3.16 {d0[], d1[], d2[]}, [r0] -// IA32 SIMD doesn't work with 16bit floats currently, so need to go to 32 bit 
and then work with two 128bit registers. See vld1q_f16 for example - -float32x2x3_t vld3_dup_f32(__transfersize(3) float32_t const * ptr); // VLD3.32 {d0[], d1[], d2[]}, [r0] -_NEON2SSE_INLINE float32x2x3_t vld3_dup_f32(__transfersize(3) float32_t const * ptr) // VLD3.32 {d0[], d1[], d2[]}, [r0] -{ - float32x2x3_t v; - int i; - for (i = 0; i<3; i++) { - v.val[i].m64_f32[0] = *(ptr + i); - v.val[i].m64_f32[1] = *(ptr + i); - } - return v; -} - -poly8x8x3_t vld3_dup_p8(__transfersize(3) poly8_t const * ptr); // VLD3.8 {d0[], d1[], d2[]}, [r0] -#define vld3_dup_p8 vld3_dup_u8 - -poly16x4x3_t vld3_dup_p16(__transfersize(3) poly16_t const * ptr); // VLD3.16 {d0[], d1[], d2[]}, [r0] -#define vld3_dup_p16 vld3_dup_s16 - - -//************* Duplicate (or propagate) quadruples: ******************* -//*********************************************************************** -//ptr[0] to all val[0] lanes, ptr[1] to all val[1] lanes, ptr[2] to all val[2] lanes and ptr[3] to all val[3] lanes -uint8x8x4_t vld4_dup_u8(__transfersize(4) uint8_t const * ptr); // VLD4.8 {d0[], d1[], d2[], d3[]}, [r0] -_NEON2SSE_INLINE uint8x8x4_t vld4_dup_u8(__transfersize(4) uint8_t const * ptr) // VLD4.8 {d0[], d1[], d2[], d3[]}, [r0] -{ - uint8x8x4_t v; - __m128i val0, val1, val2; - val0 = LOAD_SI128(ptr); //0,1,2,3, x,x,x,x,x,x,x,x, x,x,x,x - val1 = _mm_unpacklo_epi8(val0,val0); //0,0,1,1,2,2,3,3, x,x,x,x,x,x,x,x, - val1 = _mm_unpacklo_epi16(val1,val1); //0,0,0,0, 1,1,1,1,2,2,2,2,3,3,3,3 - val0 = _mm_unpacklo_epi32(val1,val1); //0,0,0,0, 0,0,0,0,1,1,1,1,1,1,1,1, - val2 = _mm_unpackhi_epi32(val1,val1); // 2,2,2,2,2,2,2,2, 3,3,3,3, 3,3,3,3 - vst1q_u8(&v.val[0], val0); - vst1q_u8(&v.val[2], val2); - return v; -} - -uint16x4x4_t vld4_dup_u16(__transfersize(4) uint16_t const * ptr); // VLD4.16 {d0[], d1[], d2[], d3[]}, [r0] -_NEON2SSE_INLINE uint16x4x4_t vld4_dup_u16(__transfersize(4) uint16_t const * ptr) // VLD4.16 {d0[], d1[], d2[], d3[]}, [r0] -{ - uint16x4x4_t v; - __m128i val0, val1, val2, val3; - 
val3 = LOAD_SI128(ptr); //0,1,2,3, x,x,x,x - val0 = _mm_shufflelo_epi16(val3, 0); //00 00 00 00 (all 0) - val1 = _mm_shufflelo_epi16(val3, 85); //01 01 01 01 (all 1) - val2 = _mm_shufflelo_epi16(val3, 170); //10 10 10 10 (all 2) - val3 = _mm_shufflelo_epi16(val3, 255); //11 11 11 11 (all 3) - _M64(v.val[0], val0); - _M64(v.val[1], val1); - _M64(v.val[2], val2); - _M64(v.val[3], val3); - return v; -} - -uint32x2x4_t vld4_dup_u32(__transfersize(4) uint32_t const * ptr); // VLD4.32 {d0[], d1[], d2[], d3[]}, [r0] -_NEON2SSE_INLINE uint32x2x4_t vld4_dup_u32(__transfersize(4) uint32_t const * ptr) // VLD4.32 {d0[], d1[], d2[], d3[]}, [r0] -{ - uint32x2x4_t v; - __m128i val0, val1, val2, val3; - val3 = LOAD_SI128(ptr); //0,1,2,3 - val0 = _mm_shuffle_epi32(val3, 0 | (0 << 2) | (2 << 4) | (3 << 6)); //0,0,2,3 - val1 = _mm_shuffle_epi32(val3, 1 | (1 << 2) | (2 << 4) | (3 << 6)); //1,1,2,3 - val2 = _mm_shuffle_epi32(val3, 2 | (2 << 2) | (3 << 4) | (3 << 6)); //2,2,3,3 - val3 = _mm_shuffle_epi32(val3, 3 | (3 << 2) | (3 << 4) | (3 << 6)); //3,3,3,3 - _M64(v.val[0], val0); - _M64(v.val[1], val1); - _M64(v.val[2], val2); - _M64(v.val[3], val3); - return v; -} - -uint64x1x4_t vld4_dup_u64(__transfersize(4) uint64_t const * ptr); // VLD1.64 {d0, d1, d2, d3}, [r0] -_NEON2SSE_INLINE uint64x1x4_t vld4_dup_u64(__transfersize(4) uint64_t const * ptr) // VLD1.64 {d0, d1, d2, d3}, [r0] -{ - uint64x1x4_t v; - v.val[0].m64_u64[0] = *(ptr); - v.val[1].m64_u64[0] = *(ptr + 1); - v.val[2].m64_u64[0] = *(ptr + 2); - v.val[3].m64_u64[0] = *(ptr + 3); - return v; -} - -int8x8x4_t vld4_dup_s8(__transfersize(4) int8_t const * ptr); // VLD4.8 {d0[], d1[], d2[], d3[]}, [r0] -#define vld4_dup_s8(ptr) vld4_dup_u8((uint8_t*)ptr) - -int16x4x4_t vld4_dup_s16(__transfersize(4) int16_t const * ptr); // VLD4.16 {d0[], d1[], d2[], d3[]}, [r0] -#define vld4_dup_s16(ptr) vld4_dup_u16((uint16_t*)ptr) - -int32x2x4_t vld4_dup_s32(__transfersize(4) int32_t const * ptr); // VLD4.32 {d0[], d1[], d2[], d3[]}, [r0] 
-#define vld4_dup_s32(ptr) vld4_dup_u32((uint32_t*)ptr) - -int64x1x4_t vld4_dup_s64(__transfersize(4) int64_t const * ptr); // VLD1.64 {d0, d1, d2, d3}, [r0] -#define vld4_dup_s64(ptr) vld4_dup_u64((uint64_t*)ptr) - -float16x4x4_t vld4_dup_f16(__transfersize(4) __fp16 const * ptr); // VLD4.16 {d0[], d1[], d2[], d3[]}, [r0] -// IA32 SIMD doesn't work with 16bit floats currently, so need to go to 32 bit and then work with two 128bit registers. See vld1q_f16 for example - -float32x2x4_t vld4_dup_f32(__transfersize(4) float32_t const * ptr); // VLD4.32 {d0[], d1[], d2[], d3[]}, [r0] -_NEON2SSE_INLINE float32x2x4_t vld4_dup_f32(__transfersize(4) float32_t const * ptr) // VLD4.32 {d0[], d1[], d2[], d3[]}, [r0] -{ - float32x2x4_t v; - int i; - for (i = 0; i<4; i++) { - v.val[i].m64_f32[0] = *(ptr + i); - v.val[i].m64_f32[1] = *(ptr + i); - } - return v; -} - -poly8x8x4_t vld4_dup_p8(__transfersize(4) poly8_t const * ptr); // VLD4.8 {d0[], d1[], d2[], d3[]}, [r0] -#define vld4_dup_p8 vld4_dup_u8 - -poly16x4x4_t vld4_dup_p16(__transfersize(4) poly16_t const * ptr); // VLD4.16 {d0[], d1[], d2[], d3[]}, [r0] -#define vld4_dup_p16 vld4_dup_u16 - - -//********************************************************************************** -//*******************Lane loads for N-element structures *************************** -//********************************************************************************** -//********************** Lane pairs ************************************************ -//does vld1_lane_xx ptr[0] to src->val[0] at lane position and ptr[1] to src->val[1] at lane position -//we assume src is 16 bit aligned - -//!!!!!!
Microsoft compiler does not allow xxxxxx_2t function arguments resulting in "formal parameter with __declspec(align('16')) won't be aligned" error -to fix this, all the functions below work with xxxxxx_2t pointers and the corresponding original functions are redefined - -//uint16x8x2_t vld2q_lane_u16(__transfersize(2) uint16_t const * ptr, uint16x8x2_t src,__constrange(0,7) int lane);// VLD2.16 {d0[0], d2[0]}, [r0] -_NEON2SSE_INLINE uint16x8x2_t vld2q_lane_u16_ptr(__transfersize(2) uint16_t const * ptr, uint16x8x2_t* src,__constrange(0,7) int lane) // VLD2.16 {d0[0], d2[0]}, [r0] -{ - uint16x8x2_t v; - v.val[0] = vld1q_lane_s16 (ptr, src->val[0], lane); - v.val[1] = vld1q_lane_s16 ((ptr + 1), src->val[1], lane); - return v; -} -#define vld2q_lane_u16(ptr, src, lane) vld2q_lane_u16_ptr(ptr, &src, lane) - -//uint32x4x2_t vld2q_lane_u32(__transfersize(2) uint32_t const * ptr, uint32x4x2_t src,__constrange(0,3) int lane);// VLD2.32 {d0[0], d2[0]}, [r0] -_NEON2SSE_INLINE uint32x4x2_t vld2q_lane_u32_ptr(__transfersize(2) uint32_t const * ptr, uint32x4x2_t* src,__constrange(0,3) int lane) // VLD2.32 {d0[0], d2[0]}, [r0] -{ - uint32x4x2_t v; - v.val[0] = _MM_INSERT_EPI32 (src->val[0], ptr[0], lane); - v.val[1] = _MM_INSERT_EPI32 (src->val[1], ptr[1], lane); - return v; -} -#define vld2q_lane_u32(ptr, src, lane) vld2q_lane_u32_ptr(ptr, &src, lane) - -//int16x8x2_t vld2q_lane_s16(__transfersize(2) int16_t const * ptr, int16x8x2_t src, __constrange(0,7)int lane);// VLD2.16 {d0[0], d2[0]}, [r0] -_NEON2SSE_INLINE int16x8x2_t vld2q_lane_s16_ptr(__transfersize(2) int16_t const * ptr, int16x8x2_t* src, __constrange(0,7) int lane) -{ - int16x8x2_t v; - v.val[0] = vld1q_lane_s16 (ptr, src->val[0], lane); - v.val[1] = vld1q_lane_s16 ((ptr + 1), src->val[1], lane); - return v; -} -#define vld2q_lane_s16(ptr, src, lane) vld2q_lane_s16_ptr(ptr, &src, lane) - -//int32x4x2_t vld2q_lane_s32(__transfersize(2) int32_t const * ptr, int32x4x2_t src, __constrange(0,3)int lane);// VLD2.32 {d0[0], 
d2[0]}, [r0] -_NEON2SSE_INLINE int32x4x2_t vld2q_lane_s32_ptr(__transfersize(2) int32_t const * ptr, int32x4x2_t* src, __constrange(0,3) int lane) -{ - int32x4x2_t v; - v.val[0] = _MM_INSERT_EPI32 (src->val[0], ptr[0], lane); - v.val[1] = _MM_INSERT_EPI32 (src->val[1], ptr[1], lane); - return v; -} -#define vld2q_lane_s32(ptr, src, lane) vld2q_lane_s32_ptr(ptr, &src, lane) - -//float16x8x2_t vld2q_lane_f16(__transfersize(2) __fp16 const * ptr, float16x8x2_t src, __constrange(0,7)int lane);// VLD2.16 {d0[0], d2[0]}, [r0] -//current IA SIMD doesn't support float16 - -//float32x4x2_t vld2q_lane_f32_ptr(__transfersize(2) float32_t const * ptr, float32x4x2_t src,__constrange(0,3) int lane);// VLD2.32 {d0[0], d2[0]}, [r0] -_NEON2SSE_INLINE float32x4x2_t vld2q_lane_f32_ptr(__transfersize(2) float32_t const * ptr, float32x4x2_t* src,__constrange(0,3) int lane) // VLD2.32 {d0[0], d2[0]}, [r0] -{ - float32x4x2_t v; - v.val[0] = vld1q_lane_f32(ptr, src->val[0], lane); - v.val[1] = vld1q_lane_f32((ptr + 1), src->val[1], lane); - return v; -} -#define vld2q_lane_f32(ptr,src,lane) vld2q_lane_f32_ptr(ptr,&src,lane) - -//poly16x8x2_t vld2q_lane_p16(__transfersize(2) poly16_t const * ptr, poly16x8x2_t src,__constrange(0,7) int lane);// VLD2.16 {d0[0], d2[0]}, [r0] -#define vld2q_lane_p16 vld2q_lane_u16 - -//uint8x8x2_t vld2_lane_u8(__transfersize(2) uint8_t const * ptr, uint8x8x2_t src, __constrange(0,7) int lane);// VLD2.8 {d0[0], d1[0]}, [r0] -_NEON2SSE_INLINE uint8x8x2_t vld2_lane_u8_ptr(__transfersize(2) uint8_t const * ptr, uint8x8x2_t* src, __constrange(0,7) int lane) // VLD2.8 {d0[0], d1[0]}, [r0] -{ - uint8x8x2_t v; - v.val[0] = vld1_lane_u8(ptr, src->val[0], lane); - v.val[1] = vld1_lane_u8((ptr + 1), src->val[1], lane); - return v; -} -#define vld2_lane_u8(ptr, src, lane) vld2_lane_u8_ptr(ptr, &src, lane) - -//uint16x4x2_t vld2_lane_u16(__transfersize(2) uint16_t const * ptr, uint16x4x2_t src, __constrange(0,3)int lane);// VLD2.16 {d0[0], d1[0]}, [r0] -_NEON2SSE_INLINE 
uint16x4x2_t vld2_lane_u16_ptr(__transfersize(2) uint16_t const * ptr, uint16x4x2_t* src, __constrange(0,3) int lane) -{ - uint16x4x2_t v; - v.val[0] = vld1_lane_u16(ptr, src->val[0], lane); - v.val[1] = vld1_lane_u16((ptr + 1), src->val[1], lane); - return v; -} -#define vld2_lane_u16(ptr, src, lane) vld2_lane_u16_ptr(ptr, &src, lane) - -//uint32x2x2_t vld2_lane_u32(__transfersize(2) uint32_t const * ptr, uint32x2x2_t src, __constrange(0,1)int lane);// VLD2.32 {d0[0], d1[0]}, [r0] -_NEON2SSE_INLINE uint32x2x2_t vld2_lane_u32_ptr(__transfersize(2) uint32_t const * ptr, uint32x2x2_t* src, __constrange(0,1) int lane) -{ - uint32x2x2_t v; - v.val[0] = vld1_lane_u32(ptr, src->val[0], lane); - v.val[1] = vld1_lane_u32((ptr + 1), src->val[1], lane); - return v; -} -#define vld2_lane_u32(ptr, src, lane) vld2_lane_u32_ptr(ptr, &src, lane) - -//int8x8x2_t vld2_lane_s8(__transfersize(2) int8_t const * ptr, int8x8x2_t src, __constrange(0,7) int lane);// VLD2.8 {d0[0], d1[0]}, [r0] -int8x8x2_t vld2_lane_s8_ptr(__transfersize(2) int8_t const * ptr, int8x8x2_t * src, __constrange(0,7) int lane); // VLD2.8 {d0[0], d1[0]}, [r0] -#define vld2_lane_s8(ptr, src, lane) vld2_lane_u8(( uint8_t*) ptr, src, lane) - -//int16x4x2_t vld2_lane_s16(__transfersize(2) int16_t const * ptr, int16x4x2_t src, __constrange(0,3) int lane);// VLD2.16 {d0[0], d1[0]}, [r0] -int16x4x2_t vld2_lane_s16_ptr(__transfersize(2) int16_t const * ptr, int16x4x2_t * src, __constrange(0,3) int lane); // VLD2.16 {d0[0], d1[0]}, [r0] -#define vld2_lane_s16(ptr, src, lane) vld2_lane_u16(( uint16_t*) ptr, src, lane) - -//int32x2x2_t vld2_lane_s32(__transfersize(2) int32_t const * ptr, int32x2x2_t src, __constrange(0,1) int lane);// VLD2.32 {d0[0], d1[0]}, [r0] -int32x2x2_t vld2_lane_s32_ptr(__transfersize(2) int32_t const * ptr, int32x2x2_t * src, __constrange(0,1) int lane); // VLD2.32 {d0[0], d1[0]}, [r0] -#define vld2_lane_s32(ptr, src, lane) vld2_lane_u32(( uint32_t*) ptr, src, lane) - -//float16x4x2_t 
vld2_lane_f16(__transfersize(2) __fp16 const * ptr, float16x4x2_t src, __constrange(0,3) int lane); // VLD2.16 {d0[0], d1[0]}, [r0] -//current IA SIMD doesn't support float16 - -float32x2x2_t vld2_lane_f32_ptr(__transfersize(2) float32_t const * ptr, float32x2x2_t * src,__constrange(0,1) int lane); // VLD2.32 {d0[0], d1[0]}, [r0] -_NEON2SSE_INLINE float32x2x2_t vld2_lane_f32_ptr(__transfersize(2) float32_t const * ptr, float32x2x2_t * src,__constrange(0,1) int lane) -{ - float32x2x2_t v; - v.val[0] = vld1_lane_f32(ptr, src->val[0], lane); - v.val[1] = vld1_lane_f32((ptr + 1), src->val[1], lane); - return v; -} -#define vld2_lane_f32(ptr, src, lane) vld2_lane_f32_ptr(ptr, &src, lane) - -//poly8x8x2_t vld2_lane_p8(__transfersize(2) poly8_t const * ptr, poly8x8x2_t src, __constrange(0,7) int lane);// VLD2.8 {d0[0], d1[0]}, [r0] -poly8x8x2_t vld2_lane_p8_ptr(__transfersize(2) poly8_t const * ptr, poly8x8x2_t * src, __constrange(0,7) int lane); // VLD2.8 {d0[0], d1[0]}, [r0] -#define vld2_lane_p8 vld2_lane_u8 - -//poly16x4x2_t vld2_lane_p16(__transfersize(2) poly16_t const * ptr, poly16x4x2_t src, __constrange(0,3)int lane);// VLD2.16 {d0[0], d1[0]}, [r0] -poly16x4x2_t vld2_lane_p16_ptr(__transfersize(2) poly16_t const * ptr, poly16x4x2_t * src, __constrange(0,3) int lane); // VLD2.16 {d0[0], d1[0]}, [r0] -#define vld2_lane_p16 vld2_lane_u16 - -//*********** Lane triplets ********************** -//************************************************* -//does vld1_lane_xx ptr[0] to src->val[0], ptr[1] to src->val[1] and ptr[2] to src->val[2] at lane position -//we assume src is 16 bit aligned - -//uint16x8x3_t vld3q_lane_u16(__transfersize(3) uint16_t const * ptr, uint16x8x3_t src,__constrange(0,7) int lane);// VLD3.16 {d0[0], d2[0], d4[0]}, [r0] -_NEON2SSE_INLINE uint16x8x3_t vld3q_lane_u16_ptr(__transfersize(3) uint16_t const * ptr, uint16x8x3_t* src,__constrange(0,7) int lane) // VLD3.16 {d0[0], d2[0], d4[0]}, [r0] -{ - uint16x8x3_t v; - v.val[0] = _MM_INSERT_EPI16 ( 
src->val[0], ptr[0], lane); - v.val[1] = _MM_INSERT_EPI16 ( src->val[1], ptr[1], lane); - v.val[2] = _MM_INSERT_EPI16 ( src->val[2], ptr[2], lane); - return v; -} -#define vld3q_lane_u16(ptr, src, lane) vld3q_lane_u16_ptr(ptr, &src, lane) - -//uint32x4x3_t vld3q_lane_u32(__transfersize(3) uint32_t const * ptr, uint32x4x3_t src,__constrange(0,3) int lane);// VLD3.32 {d0[0], d2[0], d4[0]}, [r0] -_NEON2SSE_INLINE uint32x4x3_t vld3q_lane_u32_ptr(__transfersize(3) uint32_t const * ptr, uint32x4x3_t* src,__constrange(0,3) int lane) // VLD3.32 {d0[0], d2[0], d4[0]}, [r0] -{ - uint32x4x3_t v; - v.val[0] = _MM_INSERT_EPI32 ( src->val[0], ptr[0], lane); - v.val[1] = _MM_INSERT_EPI32 ( src->val[1], ptr[1], lane); - v.val[2] = _MM_INSERT_EPI32 ( src->val[2], ptr[2], lane); - return v; -} -#define vld3q_lane_u32(ptr, src, lane) vld3q_lane_u32_ptr(ptr, &src, lane) - -//int16x8x3_t vld3q_lane_s16(__transfersize(3) int16_t const * ptr, int16x8x3_t src, __constrange(0,7)int lane);// VLD3.16 {d0[0], d2[0], d4[0]}, [r0] -_NEON2SSE_INLINE int16x8x3_t vld3q_lane_s16_ptr(__transfersize(3) int16_t const * ptr, int16x8x3_t* src, __constrange(0,7) int lane) // VLD3.16 {d0[0], d2[0], d4[0]}, [r0] -{ - int16x8x3_t v; - v.val[0] = _MM_INSERT_EPI16 ( src->val[0], ptr[0], lane); - v.val[1] = _MM_INSERT_EPI16 ( src->val[1], ptr[1], lane); - v.val[2] = _MM_INSERT_EPI16 ( src->val[2], ptr[2], lane); - return v; -} -#define vld3q_lane_s16(ptr, src, lane) vld3q_lane_s16_ptr(ptr, &src, lane) - -//int32x4x3_t vld3q_lane_s32(__transfersize(3) int32_t const * ptr, int32x4x3_t src, __constrange(0,3)int lane);// VLD3.32 {d0[0], d2[0], d4[0]}, [r0] -_NEON2SSE_INLINE int32x4x3_t vld3q_lane_s32_ptr(__transfersize(3) int32_t const * ptr, int32x4x3_t* src, __constrange(0,3) int lane) // VLD3.32 {d0[0], d2[0], d4[0]}, [r0] -{ - int32x4x3_t v; - v.val[0] = _MM_INSERT_EPI32 ( src->val[0], ptr[0], lane); - v.val[1] = _MM_INSERT_EPI32 ( src->val[1], ptr[1], lane); - v.val[2] = _MM_INSERT_EPI32 ( src->val[2], 
ptr[2], lane); - return v; -} -#define vld3q_lane_s32(ptr, src, lane) vld3q_lane_s32_ptr(ptr, &src, lane) - -float16x8x3_t vld3q_lane_f16_ptr(__transfersize(3) __fp16 const * ptr, float16x8x3_t * src, __constrange(0,7) int lane); // VLD3.16 {d0[0], d2[0], d4[0]}, [r0] -//current IA SIMD doesn't support float16 -#define vld3q_lane_f16(ptr, src, lane) vld3q_lane_f16_ptr(ptr, &src, lane) - - -//float32x4x3_t vld3q_lane_f32(__transfersize(3) float32_t const * ptr, float32x4x3_t src,__constrange(0,3) int lane);// VLD3.32 {d0[0], d2[0], d4[0]}, [r0] -_NEON2SSE_INLINE float32x4x3_t vld3q_lane_f32_ptr(__transfersize(3) float32_t const * ptr, float32x4x3_t* src,__constrange(0,3) int lane) // VLD3.32 {d0[0], d2[0], d4[0]}, [r0] -{ - float32x4x3_t v; - v.val[0] = vld1q_lane_f32(&ptr[0], src->val[0], lane); - v.val[1] = vld1q_lane_f32(&ptr[1], src->val[1], lane); - v.val[2] = vld1q_lane_f32(&ptr[2], src->val[2], lane); - return v; -} -#define vld3q_lane_f32(ptr,src,lane) vld3q_lane_f32_ptr(ptr,&src,lane) - -poly16x8x3_t vld3q_lane_p16_ptr(__transfersize(3) poly16_t const * ptr, poly16x8x3_t * src,__constrange(0,7) int lane); // VLD3.16 {d0[0], d2[0], d4[0]}, [r0] -#define vld3q_lane_p16 vld3q_lane_u16 - -//uint8x8x3_t vld3_lane_u8(__transfersize(3) uint8_t const * ptr, uint8x8x3_t src, __constrange(0,7) int lane);// VLD3.8 {d0[0], d1[0], d2[0]}, [r0] -_NEON2SSE_INLINE uint8x8x3_t vld3_lane_u8_ptr(__transfersize(3) uint8_t const * ptr, uint8x8x3_t* src, __constrange(0,7) int lane) // VLD3.8 {d0[0], d1[0], d2[0]}, [r0] -{ - uint8x8x3_t v; - v.val[0] = vld1_lane_u8(ptr, src->val[0], lane); - v.val[1] = vld1_lane_u8((ptr + 1), src->val[1], lane); - v.val[2] = vld1_lane_u8((ptr + 2), src->val[2], lane); - return v; -} -#define vld3_lane_u8(ptr, src, lane) vld3_lane_u8_ptr(ptr, &src, lane) - -//uint16x4x3_t vld3_lane_u16(__transfersize(3) uint16_t const * ptr, uint16x4x3_t src, __constrange(0,3)int lane);// VLD3.16 {d0[0], d1[0], d2[0]}, [r0] -_NEON2SSE_INLINE uint16x4x3_t 
vld3_lane_u16_ptr(__transfersize(3) uint16_t const * ptr, uint16x4x3_t* src, __constrange(0,3) int lane) // VLD3.16 {d0[0], d1[0], d2[0]}, [r0]
{
    uint16x4x3_t v;
    v.val[0] = vld1_lane_u16(ptr, src->val[0], lane);
    v.val[1] = vld1_lane_u16((ptr + 1), src->val[1], lane);
    v.val[2] = vld1_lane_u16((ptr + 2), src->val[2], lane);
    return v;
}
#define vld3_lane_u16(ptr, src, lane) vld3_lane_u16_ptr(ptr, &src, lane)

//uint32x2x3_t vld3_lane_u32(__transfersize(3) uint32_t const * ptr, uint32x2x3_t src, __constrange(0,1)int lane);// VLD3.32 {d0[0], d1[0], d2[0]}, [r0]
_NEON2SSE_INLINE uint32x2x3_t vld3_lane_u32_ptr(__transfersize(3) uint32_t const * ptr, uint32x2x3_t* src, __constrange(0,1) int lane) // VLD3.32 {d0[0], d1[0], d2[0]}, [r0]
{
    //need to merge into 128 bit anyway
    uint32x2x3_t v;
    v.val[0] = vld1_lane_u32(ptr, src->val[0], lane);
    v.val[1] = vld1_lane_u32((ptr + 1), src->val[1], lane);
    v.val[2] = vld1_lane_u32((ptr + 2), src->val[2], lane);
    return v;
}
#define vld3_lane_u32(ptr, src, lane) vld3_lane_u32_ptr(ptr, &src, lane)

int8x8x3_t vld3_lane_s8_ptr(__transfersize(3) int8_t const * ptr, int8x8x3_t * src, __constrange(0,7) int lane); // VLD3.8 {d0[0], d1[0], d2[0]}, [r0]
#define vld3_lane_s8(ptr, src, lane) vld3_lane_u8_ptr(( uint8_t*) ptr, &src, lane)

int16x4x3_t vld3_lane_s16_ptr(__transfersize(3) int16_t const * ptr, int16x4x3_t * src, __constrange(0,3) int lane); // VLD3.16 {d0[0], d1[0], d2[0]}, [r0]
#define vld3_lane_s16(ptr, src, lane) vld3_lane_u16_ptr(( uint16_t*) ptr, &src, lane)

int32x2x3_t vld3_lane_s32_ptr(__transfersize(3) int32_t const * ptr, int32x2x3_t * src, __constrange(0,1) int lane); // VLD3.32 {d0[0], d1[0], d2[0]}, [r0]
#define vld3_lane_s32(ptr, src, lane) vld3_lane_u32_ptr(( uint32_t*) ptr, &src, lane)

float16x4x3_t vld3_lane_f16_ptr(__transfersize(3) __fp16 const * ptr, float16x4x3_t * src, __constrange(0,3) int lane); // VLD3.16 {d0[0], d1[0], d2[0]}, [r0]
//current IA SIMD doesn't support
//float16

//float32x2x3_t vld3_lane_f32(__transfersize(3) float32_t const * ptr, float32x2x3_t src,__constrange(0,1) int lane);// VLD3.32 {d0[0], d1[0], d2[0]}, [r0]
_NEON2SSE_INLINE float32x2x3_t vld3_lane_f32_ptr(__transfersize(3) float32_t const * ptr, float32x2x3_t* src,__constrange(0,1) int lane) // VLD3.32 {d0[0], d1[0], d2[0]}, [r0]
{
    float32x2x3_t v;
    v.val[0] = vld1_lane_f32(ptr, src->val[0], lane);
    v.val[1] = vld1_lane_f32((ptr + 1), src->val[1], lane);
    v.val[2] = vld1_lane_f32((ptr + 2), src->val[2], lane);
    return v;
}
#define vld3_lane_f32(ptr,src,lane) vld3_lane_f32_ptr(ptr,&src,lane)

//poly8x8x3_t vld3_lane_p8_ptr(__transfersize(3) poly8_t const * ptr, poly8x8x3_t * src, __constrange(0,7) int lane); // VLD3.8 {d0[0], d1[0], d2[0]}, [r0]
#define vld3_lane_p8 vld3_lane_u8

//poly16x4x3_t vld3_lane_p16(__transfersize(3) poly16_t const * ptr, poly16x4x3_t * src, __constrange(0,3) int lane); // VLD3.16 {d0[0], d1[0], d2[0]}, [r0]
#define vld3_lane_p16 vld3_lane_u16

//******************* Lane Quadruples load ***************************
//*********************************************************************
//does vld1_lane_xx ptr[0] to src->val[0], ptr[1] to src->val[1], ptr[2] to src->val[2] and ptr[3] to src->val[3] at lane position
//we assume src is 16 bit aligned

//uint16x8x4_t vld4q_lane_u16(__transfersize(4) uint16_t const * ptr, uint16x8x4_t src,__constrange(0,7) int lane)// VLD4.16 {d0[0], d2[0], d4[0], d6[0]}, [r0]
_NEON2SSE_INLINE uint16x8x4_t vld4q_lane_u16_ptr(__transfersize(4) uint16_t const * ptr, uint16x8x4_t* src,__constrange(0,7) int lane)
{
    uint16x8x4_t v;
    v.val[0] = _MM_INSERT_EPI16 ( src->val[0], ptr[0], lane);
    v.val[1] = _MM_INSERT_EPI16 ( src->val[1], ptr[1], lane);
    v.val[2] = _MM_INSERT_EPI16 ( src->val[2], ptr[2], lane);
    v.val[3] = _MM_INSERT_EPI16 ( src->val[3], ptr[3], lane);
    return v;
}
#define vld4q_lane_u16(ptr, src, lane) vld4q_lane_u16_ptr(ptr, &src, lane)

//uint32x4x4_t
//vld4q_lane_u32(__transfersize(4) uint32_t const * ptr, uint32x4x4_t src,__constrange(0,3) int lane)// VLD4.32 {d0[0], d2[0], d4[0], d6[0]}, [r0]
_NEON2SSE_INLINE uint32x4x4_t vld4q_lane_u32_ptr(__transfersize(4) uint32_t const * ptr, uint32x4x4_t* src,__constrange(0,3) int lane)
{
    uint32x4x4_t v;
    v.val[0] = _MM_INSERT_EPI32 ( src->val[0], ptr[0], lane);
    v.val[1] = _MM_INSERT_EPI32 ( src->val[1], ptr[1], lane);
    v.val[2] = _MM_INSERT_EPI32 ( src->val[2], ptr[2], lane);
    v.val[3] = _MM_INSERT_EPI32 ( src->val[3], ptr[3], lane);
    return v;
}
#define vld4q_lane_u32(ptr, src, lane) vld4q_lane_u32_ptr(ptr, &src, lane)

//int16x8x4_t vld4q_lane_s16(__transfersize(4) int16_t const * ptr, int16x8x4_t src, __constrange(0,7)int lane);// VLD4.16 {d0[0], d2[0], d4[0], d6[0]}, [r0]
int16x8x4_t vld4q_lane_s16_ptr(__transfersize(4) int16_t const * ptr, int16x8x4_t * src, __constrange(0,7) int lane); // VLD4.16 {d0[0], d2[0], d4[0], d6[0]}, [r0]
#define vld4q_lane_s16(ptr, src, lane) vld4q_lane_u16(( uint16_t*) ptr, src, lane)

//int32x4x4_t vld4q_lane_s32(__transfersize(4) int32_t const * ptr, int32x4x4_t src, __constrange(0,3)int lane);// VLD4.32 {d0[0], d2[0], d4[0], d6[0]}, [r0]
int32x4x4_t vld4q_lane_s32_ptr(__transfersize(4) int32_t const * ptr, int32x4x4_t * src, __constrange(0,3) int lane); // VLD4.32 {d0[0], d2[0], d4[0], d6[0]}, [r0]
#define vld4q_lane_s32(ptr, src, lane) vld4q_lane_u32(( uint32_t*) ptr, src, lane)

//float16x8x4_t vld4q_lane_f16(__transfersize(4) __fp16 const * ptr, float16x8x4_t src, __constrange(0,7)int lane);// VLD4.16 {d0[0], d2[0], d4[0], d6[0]}, [r0]
float16x8x4_t vld4q_lane_f16_ptr(__transfersize(4) __fp16 const * ptr, float16x8x4_t * src, __constrange(0,7) int lane); // VLD4.16 {d0[0], d2[0], d4[0], d6[0]}, [r0]
//current IA SIMD doesn't support float16

//float32x4x4_t vld4q_lane_f32(__transfersize(4) float32_t const * ptr, float32x4x4_t src,__constrange(0,3) int lane)// VLD4.32 {d0[0], d2[0], d4[0], d6[0]}, [r0]
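//The vld4q lane loads above all follow one pattern: four consecutive scalars
//are inserted into the same lane of four destination vectors, and every other
//lane keeps its previous contents. As a scalar reference model of that
//behavior (the helper name and the plain 2-D array standing in for the vector
//quadruple are illustrative only, not part of this header):

```c
#include <stddef.h>
#include <assert.h>

/* Scalar sketch of a VLD4-lane load: ptr[k] overwrites lane `lane`
   of the k-th destination vector (k = 0..3), mirroring the four
   _MM_INSERT_* calls used above; all other lanes are preserved. */
static void vld4_lane_ref_u32(const unsigned int *ptr,
                              unsigned int dst[4][4], int lane)
{
    for (int k = 0; k < 4; k++)
        dst[k][lane] = ptr[k];
}
```

//This also shows why the intrinsics take `src` by value and return it:
//the unwritten lanes must flow through unchanged.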
_NEON2SSE_INLINE float32x4x4_t vld4q_lane_f32_ptr(__transfersize(4) float32_t const * ptr, float32x4x4_t* src,__constrange(0,3) int lane)
{
    float32x4x4_t v;
    v.val[0] = vld1q_lane_f32(&ptr[0], src->val[0], lane);
    v.val[1] = vld1q_lane_f32(&ptr[1], src->val[1], lane);
    v.val[2] = vld1q_lane_f32(&ptr[2], src->val[2], lane);
    v.val[3] = vld1q_lane_f32(&ptr[3], src->val[3], lane);
    return v;
}
#define vld4q_lane_f32(ptr,val,lane) vld4q_lane_f32_ptr(ptr,&val,lane)

//poly16x8x4_t vld4q_lane_p16(__transfersize(4) poly16_t const * ptr, poly16x8x4_t src,__constrange(0,7) int lane);// VLD4.16 {d0[0], d2[0], d4[0], d6[0]}, [r0]
poly16x8x4_t vld4q_lane_p16_ptr(__transfersize(4) poly16_t const * ptr, poly16x8x4_t * src,__constrange(0,7) int lane); // VLD4.16 {d0[0], d2[0], d4[0], d6[0]}, [r0]
#define vld4q_lane_p16 vld4q_lane_u16

//uint8x8x4_t vld4_lane_u8(__transfersize(4) uint8_t const * ptr, uint8x8x4_t src, __constrange(0,7) int lane)// VLD4.8 {d0[0], d1[0], d2[0], d3[0]}, [r0]
_NEON2SSE_INLINE uint8x8x4_t vld4_lane_u8_ptr(__transfersize(4) uint8_t const * ptr, uint8x8x4_t* src, __constrange(0,7) int lane)
{
    uint8x8x4_t v;
    v.val[0] = vld1_lane_u8(ptr, src->val[0], lane);
    v.val[1] = vld1_lane_u8((ptr + 1), src->val[1], lane);
    v.val[2] = vld1_lane_u8((ptr + 2), src->val[2], lane);
    v.val[3] = vld1_lane_u8((ptr + 3), src->val[3], lane);
    return v;
}
#define vld4_lane_u8(ptr, src, lane) vld4_lane_u8_ptr(ptr, &src, lane)

//uint16x4x4_t vld4_lane_u16(__transfersize(4) uint16_t const * ptr, uint16x4x4_t src, __constrange(0,3)int lane)// VLD4.16 {d0[0], d1[0], d2[0], d3[0]}, [r0]
_NEON2SSE_INLINE uint16x4x4_t vld4_lane_u16_ptr(__transfersize(4) uint16_t const * ptr, uint16x4x4_t* src, __constrange(0,3) int lane)
{
    uint16x4x4_t v;
    v.val[0] = vld1_lane_u16(ptr, src->val[0], lane);
    v.val[1] = vld1_lane_u16((ptr + 1), src->val[1], lane);
    v.val[2] = vld1_lane_u16((ptr + 2), src->val[2], lane);
    v.val[3] = vld1_lane_u16((ptr + 3),
src->val[3], lane);
    return v;
}
#define vld4_lane_u16(ptr, src, lane) vld4_lane_u16_ptr(ptr, &src, lane)

//uint32x2x4_t vld4_lane_u32(__transfersize(4) uint32_t const * ptr, uint32x2x4_t src, __constrange(0,1)int lane)// VLD4.32 {d0[0], d1[0], d2[0], d3[0]}, [r0]
_NEON2SSE_INLINE uint32x2x4_t vld4_lane_u32_ptr(__transfersize(4) uint32_t const * ptr, uint32x2x4_t* src, __constrange(0,1) int lane)
{
    uint32x2x4_t v;
    v.val[0] = vld1_lane_u32(ptr, src->val[0], lane);
    v.val[1] = vld1_lane_u32((ptr + 1), src->val[1], lane);
    v.val[2] = vld1_lane_u32((ptr + 2), src->val[2], lane);
    v.val[3] = vld1_lane_u32((ptr + 3), src->val[3], lane);
    return v;
}
#define vld4_lane_u32(ptr, src, lane) vld4_lane_u32_ptr(ptr, &src, lane)

//int8x8x4_t vld4_lane_s8(__transfersize(4) int8_t const * ptr, int8x8x4_t src, __constrange(0,7) int lane);// VLD4.8 {d0[0], d1[0], d2[0], d3[0]}, [r0]
int8x8x4_t vld4_lane_s8_ptr(__transfersize(4) int8_t const * ptr, int8x8x4_t * src, __constrange(0,7) int lane);
#define vld4_lane_s8(ptr,src,lane) vld4_lane_u8((uint8_t*)ptr,src,lane)

//int16x4x4_t vld4_lane_s16(__transfersize(4) int16_t const * ptr, int16x4x4_t src, __constrange(0,3) int lane);// VLD4.16 {d0[0], d1[0], d2[0], d3[0]}, [r0]
int16x4x4_t vld4_lane_s16_ptr(__transfersize(4) int16_t const * ptr, int16x4x4_t * src, __constrange(0,3) int lane);
#define vld4_lane_s16(ptr,src,lane) vld4_lane_u16((uint16_t*)ptr,src,lane)

//int32x2x4_t vld4_lane_s32(__transfersize(4) int32_t const * ptr, int32x2x4_t src, __constrange(0,1) int lane);// VLD4.32 {d0[0], d1[0], d2[0], d3[0]}, [r0]
int32x2x4_t vld4_lane_s32_ptr(__transfersize(4) int32_t const * ptr, int32x2x4_t * src, __constrange(0,1) int lane);
#define vld4_lane_s32(ptr,src,lane) vld4_lane_u32((uint32_t*)ptr,src,lane)

//float16x4x4_t vld4_lane_f16(__transfersize(4) __fp16 const * ptr, float16x4x4_t src, __constrange(0,3)int lane);// VLD4.16 {d0[0], d1[0], d2[0], d3[0]}, [r0]
float16x4x4_t
vld4_lane_f16_ptr(__transfersize(4) __fp16 const * ptr, float16x4x4_t * src, __constrange(0,3) int lane);
//current IA SIMD doesn't support float16

//float32x2x4_t vld4_lane_f32(__transfersize(4) float32_t const * ptr, float32x2x4_t src,__constrange(0,1) int lane)// VLD4.32 {d0[0], d1[0], d2[0], d3[0]}, [r0]
_NEON2SSE_INLINE float32x2x4_t vld4_lane_f32_ptr(__transfersize(4) float32_t const * ptr, float32x2x4_t* src,__constrange(0,1) int lane)
{
    //serial solution may be faster
    float32x2x4_t v;
    v.val[0] = vld1_lane_f32(ptr, src->val[0], lane);
    v.val[1] = vld1_lane_f32((ptr + 1), src->val[1], lane);
    v.val[2] = vld1_lane_f32((ptr + 2), src->val[2], lane);
    v.val[3] = vld1_lane_f32((ptr + 3), src->val[3], lane);
    return v;
}
#define vld4_lane_f32(ptr,src,lane) vld4_lane_f32_ptr(ptr,&src,lane)

//poly8x8x4_t vld4_lane_p8(__transfersize(4) poly8_t const * ptr, poly8x8x4_t src, __constrange(0,7) int lane);// VLD4.8 {d0[0], d1[0], d2[0], d3[0]}, [r0]
poly8x8x4_t vld4_lane_p8_ptr(__transfersize(4) poly8_t const * ptr, poly8x8x4_t * src, __constrange(0,7) int lane);
#define vld4_lane_p8 vld4_lane_u8

//poly16x4x4_t vld4_lane_p16(__transfersize(4) poly16_t const * ptr, poly16x4x4_t src, __constrange(0,3)int lane);// VLD4.16 {d0[0], d1[0], d2[0], d3[0]}, [r0]
poly16x4x4_t vld4_lane_p16_ptr(__transfersize(4) poly16_t const * ptr, poly16x4x4_t * src, __constrange(0,3) int lane);
#define vld4_lane_p16 vld4_lane_u16

//******************* Store duplets *********************************************
//********************************************************************************
//here we assume the ptr is 16bit aligned. If not we need to use _mm_storeu_si128 like shown in vst1q_u8 function
//If necessary you need to modify all store functions accordingly.
//See more comments to "Store single" functions
//void vst2q_u8(__transfersize(32) uint8_t * ptr, uint8x16x2_t val)// VST2.8 {d0, d2}, [r0]
_NEON2SSE_INLINE void vst2q_u8_ptr(__transfersize(32) uint8_t * ptr, uint8x16x2_t* val)
{
    uint8x16x2_t v;
    v.val[0] = _mm_unpacklo_epi8(val->val[0], val->val[1]);
    v.val[1] = _mm_unpackhi_epi8(val->val[0], val->val[1]);
    vst1q_u8 (ptr, v.val[0]);
    vst1q_u8 ((ptr + 16), v.val[1]);
}
#define vst2q_u8(ptr, val) vst2q_u8_ptr(ptr, &val)

//void vst2q_u16(__transfersize(16) uint16_t * ptr, uint16x8x2_t val)// VST2.16 {d0, d2}, [r0]
_NEON2SSE_INLINE void vst2q_u16_ptr(__transfersize(16) uint16_t * ptr, uint16x8x2_t* val)
{
    uint16x8x2_t v;
    v.val[0] = _mm_unpacklo_epi16(val->val[0], val->val[1]);
    v.val[1] = _mm_unpackhi_epi16(val->val[0], val->val[1]);
    vst1q_u16 (ptr, v.val[0]);
    vst1q_u16 ((ptr + 8), v.val[1]);
}
#define vst2q_u16(ptr, val) vst2q_u16_ptr(ptr, &val)

//void vst2q_u32(__transfersize(8) uint32_t * ptr, uint32x4x2_t val)// VST2.32 {d0, d2}, [r0]
_NEON2SSE_INLINE void vst2q_u32_ptr(__transfersize(8) uint32_t* ptr, uint32x4x2_t* val)
{
    uint32x4x2_t v;
    v.val[0] = _mm_unpacklo_epi32(val->val[0], val->val[1]);
    v.val[1] = _mm_unpackhi_epi32(val->val[0], val->val[1]);
    vst1q_u32 (ptr, v.val[0]);
    vst1q_u32 ((ptr + 4), v.val[1]);
}
#define vst2q_u32(ptr, val) vst2q_u32_ptr(ptr, &val)

//void vst2q_s8(__transfersize(32) int8_t * ptr, int8x16x2_t val); // VST2.8 {d0, d2}, [r0]
void vst2q_s8_ptr(__transfersize(32) int8_t * ptr, int8x16x2_t * val);
#define vst2q_s8(ptr, val) vst2q_u8((uint8_t*)(ptr), val)

//void vst2q_s16(__transfersize(16) int16_t * ptr, int16x8x2_t val);// VST2.16 {d0, d2}, [r0]
void vst2q_s16_ptr(__transfersize(16) int16_t * ptr, int16x8x2_t * val);
#define vst2q_s16(ptr, val) vst2q_u16((uint16_t*)(ptr), val)

//void vst2q_s32(__transfersize(8) int32_t * ptr, int32x4x2_t val);// VST2.32 {d0, d2}, [r0]
void vst2q_s32_ptr(__transfersize(8) int32_t * ptr, int32x4x2_t *
val);
#define vst2q_s32(ptr, val) vst2q_u32((uint32_t*)(ptr), val)

//void vst2q_f16(__transfersize(16) __fp16 * ptr, float16x8x2_t val);// VST2.16 {d0, d2}, [r0]
void vst2q_f16_ptr(__transfersize(16) __fp16 * ptr, float16x8x2_t * val);
// IA32 SIMD doesn't work with 16bit floats currently

//void vst2q_f32(__transfersize(8) float32_t * ptr, float32x4x2_t val)// VST2.32 {d0, d2}, [r0]
_NEON2SSE_INLINE void vst2q_f32_ptr(__transfersize(8) float32_t* ptr, float32x4x2_t* val)
{
    float32x4x2_t v;
    v.val[0] = _mm_unpacklo_ps(val->val[0], val->val[1]);
    v.val[1] = _mm_unpackhi_ps(val->val[0], val->val[1]);
    vst1q_f32 (ptr, v.val[0]);
    vst1q_f32 ((ptr + 4), v.val[1]);
}
#define vst2q_f32(ptr, val) vst2q_f32_ptr(ptr, &val)

//void vst2q_p8(__transfersize(32) poly8_t * ptr, poly8x16x2_t val);// VST2.8 {d0, d2}, [r0]
void vst2q_p8_ptr(__transfersize(32) poly8_t * ptr, poly8x16x2_t * val);
#define vst2q_p8 vst2q_u8

//void vst2q_p16(__transfersize(16) poly16_t * ptr, poly16x8x2_t val);// VST2.16 {d0, d2}, [r0]
void vst2q_p16_ptr(__transfersize(16) poly16_t * ptr, poly16x8x2_t * val);
#define vst2q_p16 vst2q_u16

//void vst2_u8(__transfersize(16) uint8_t * ptr, uint8x8x2_t val);// VST2.8 {d0, d1}, [r0]
_NEON2SSE_INLINE void vst2_u8_ptr(__transfersize(16) uint8_t * ptr, uint8x8x2_t* val)
{
    __m128i v0;
    v0 = _mm_unpacklo_epi8(_pM128i(val->val[0]), _pM128i(val->val[1]));
    vst1q_u8 (ptr, v0);
}
#define vst2_u8(ptr, val) vst2_u8_ptr(ptr, &val)

//void vst2_u16(__transfersize(8) uint16_t * ptr, uint16x4x2_t val);// VST2.16 {d0, d1}, [r0]
_NEON2SSE_INLINE void vst2_u16_ptr(__transfersize(8) uint16_t * ptr, uint16x4x2_t* val)
{
    __m128i v0;
    v0 = _mm_unpacklo_epi16(_pM128i(val->val[0]), _pM128i(val->val[1]));
    vst1q_u16 (ptr, v0);
}
#define vst2_u16(ptr, val) vst2_u16_ptr(ptr, &val)

//void vst2_u32(__transfersize(4) uint32_t * ptr, uint32x2x2_t val);// VST2.32 {d0, d1}, [r0]
_NEON2SSE_INLINE void vst2_u32_ptr(__transfersize(4) uint32_t *
ptr, uint32x2x2_t* val)
{
    __m128i v0;
    v0 = _mm_unpacklo_epi32(_pM128i(val->val[0]), _pM128i(val->val[1]));
    vst1q_u32 (ptr, v0);
}
#define vst2_u32(ptr, val) vst2_u32_ptr(ptr, &val)


//void vst2_u64(__transfersize(2) uint64_t * ptr, uint64x1x2_t val);// VST1.64 {d0, d1}, [r0]
void vst2_u64_ptr(__transfersize(2) uint64_t * ptr, uint64x1x2_t * val);
_NEON2SSE_INLINE void vst2_u64_ptr(__transfersize(2) uint64_t * ptr, uint64x1x2_t* val)
{
    *(ptr) = val->val[0].m64_u64[0];
    *(ptr + 1) = val->val[1].m64_u64[0];
}
#define vst2_u64(ptr, val) vst2_u64_ptr(ptr, &val)

//void vst2_s8(__transfersize(16) int8_t * ptr, int8x8x2_t val);// VST2.8 {d0, d1}, [r0]
#define vst2_s8(ptr, val) vst2_u8((uint8_t*) ptr, val)

//void vst2_s16(__transfersize(8) int16_t * ptr, int16x4x2_t val); // VST2.16 {d0, d1}, [r0]
#define vst2_s16(ptr,val) vst2_u16((uint16_t*) ptr, val)

//void vst2_s32(__transfersize(4) int32_t * ptr, int32x2x2_t val); // VST2.32 {d0, d1}, [r0]
#define vst2_s32(ptr,val) vst2_u32((uint32_t*) ptr, val)

//void vst2_s64(__transfersize(2) int64_t * ptr, int64x1x2_t val);
#define vst2_s64(ptr,val) vst2_u64((uint64_t*) ptr,val)

//void vst2_f16(__transfersize(8) __fp16 * ptr, float16x4x2_t val); // VST2.16 {d0, d1}, [r0]
//current IA SIMD doesn't support float16

//void vst2_f32(__transfersize(4) float32_t * ptr, float32x2x2_t val); // VST2.32 {d0, d1}, [r0]
_NEON2SSE_INLINE void vst2_f32_ptr(__transfersize(4) float32_t* ptr, float32x2x2_t* val)
{
    *(ptr) = val->val[0].m64_f32[0];
    *(ptr + 1) = val->val[1].m64_f32[0];
    *(ptr + 2) = val->val[0].m64_f32[1];
    *(ptr + 3) = val->val[1].m64_f32[1];
}
#define vst2_f32(ptr, val) vst2_f32_ptr(ptr, &val)

//void vst2_p8_ptr(__transfersize(16) poly8_t * ptr, poly8x8x2_t * val); // VST2.8 {d0, d1}, [r0]
#define vst2_p8 vst2_u8

//void vst2_p16_ptr(__transfersize(8) poly16_t * ptr, poly16x4x2_t * val); // VST2.16 {d0, d1}, [r0]
#define vst2_p16 vst2_u16

//******************** Triplets
//store *****************************************
//******************************************************************************
//void vst3q_u8(__transfersize(48) uint8_t * ptr, uint8x16x3_t val)// VST3.8 {d0, d2, d4}, [r0]
_NEON2SSE_INLINE void vst3q_u8_ptr(__transfersize(48) uint8_t * ptr, uint8x16x3_t* val)
{
    uint8x16x3_t v;
    __m128i v0,v1,v2, cff, bldmask;
    _NEON2SSE_ALIGN_16 uint8_t mask0[16] = {0, 1, 0xff, 2, 3,0xff, 4, 5,0xff, 6,7,0xff, 8,9,0xff, 10};
    _NEON2SSE_ALIGN_16 uint8_t mask1[16] = {0, 0xff, 1, 2, 0xff, 3, 4, 0xff, 5, 6, 0xff, 7,8,0xff, 9,10};
    _NEON2SSE_ALIGN_16 uint8_t mask2[16] = {0xff, 6, 7, 0xff, 8, 9,0xff, 10, 11,0xff, 12,13,0xff, 14,15,0xff};
    _NEON2SSE_ALIGN_16 uint8_t mask2lo[16] = {0xff,0xff, 0, 0xff,0xff, 1, 0xff,0xff, 2, 0xff,0xff, 3, 0xff,0xff, 4, 0xff};
    _NEON2SSE_ALIGN_16 uint8_t mask2med[16] = {0xff, 5, 0xff, 0xff, 6, 0xff,0xff, 7, 0xff,0xff, 8, 0xff,0xff, 9, 0xff, 0xff};
    _NEON2SSE_ALIGN_16 uint8_t mask2hi[16] = {10, 0xff,0xff, 11, 0xff,0xff, 12, 0xff,0xff, 13, 0xff,0xff, 14, 0xff, 0xff, 15};

    v0 = _mm_unpacklo_epi8(val->val[0], val->val[1]); //0,1, 3,4, 6,7, 9,10, 12,13, 15,16, 18,19, 21,22
    v2 = _mm_unpackhi_epi8(val->val[0], val->val[1]); //24,25, 27,28, 30,31, 33,34, 36,37, 39,40, 42,43, 45,46
    v1 = _mm_alignr_epi8(v2, v0, 11); //12,13, 15,16, 18,19, 21,22, 24,25, 27,28, 30,31, 33,34
    v.val[0] = _mm_shuffle_epi8(v0, *(__m128i*)mask0); //make holes for the v.val[2] data embedding
    v.val[2] = _mm_shuffle_epi8(val->val[2], *(__m128i*)mask2lo); //make plugs for the v.val[2] data embedding
    cff = _mm_cmpeq_epi8(v0, v0); //all ff
    bldmask = _mm_cmpeq_epi8(*(__m128i*)mask0, cff);
    v.val[0] = _MM_BLENDV_EPI8(v.val[0], v.val[2], bldmask);
    vst1q_u8(ptr, v.val[0]);
    v.val[0] = _mm_shuffle_epi8(v1, *(__m128i*)mask1); //make holes for the v.val[2] data embedding
    v.val[2] = _mm_shuffle_epi8(val->val[2], *(__m128i*)mask2med); //make plugs for the v.val[2] data embedding
    bldmask = _mm_cmpeq_epi8(*(__m128i*)mask1,
cff);
    v.val[1] = _MM_BLENDV_EPI8(v.val[0],v.val[2], bldmask);
    vst1q_u8((ptr + 16), v.val[1]);
    v.val[0] = _mm_shuffle_epi8(v2, *(__m128i*)mask2); //make holes for the v.val[2] data embedding
    v.val[2] = _mm_shuffle_epi8(val->val[2], *(__m128i*)mask2hi); //make plugs for the v.val[2] data embedding
    bldmask = _mm_cmpeq_epi8(*(__m128i*)mask2, cff);
    v.val[2] = _MM_BLENDV_EPI8(v.val[0],v.val[2], bldmask );
    vst1q_u8((ptr + 32), v.val[2]);
}
#define vst3q_u8(ptr, val) vst3q_u8_ptr(ptr, &val)

//void vst3q_u16(__transfersize(24) uint16_t * ptr, uint16x8x3_t val)// VST3.16 {d0, d2, d4}, [r0]
_NEON2SSE_INLINE void vst3q_u16_ptr(__transfersize(24) uint16_t * ptr, uint16x8x3_t* val)
{
    uint16x8x3_t v;
    __m128i v0,v1,v2, cff, bldmask;
    _NEON2SSE_ALIGN_16 uint8_t mask0[16] = {0,1, 2,3, 0xff,0xff, 4,5, 6,7,0xff,0xff, 8,9,10,11};
    _NEON2SSE_ALIGN_16 uint8_t mask1[16] = {0xff, 0xff, 0,1, 2,3, 0xff,0xff, 4,5, 6,7, 0xff,0xff, 8,9};
    _NEON2SSE_ALIGN_16 uint8_t mask2[16] = {6,7,0xff,0xff, 8,9,10,11, 0xff, 0xff, 12,13,14,15, 0xff, 0xff};
    _NEON2SSE_ALIGN_16 uint8_t mask2lo[16] = {0xff,0xff, 0xff,0xff, 0,1, 0xff,0xff, 0xff,0xff, 2,3, 0xff,0xff, 0xff,0xff};
    _NEON2SSE_ALIGN_16 uint8_t mask2med[16] = {4,5, 0xff,0xff,0xff,0xff, 6,7, 0xff, 0xff,0xff,0xff, 8,9, 0xff, 0xff};
    _NEON2SSE_ALIGN_16 uint8_t mask2hi[16] = {0xff, 0xff, 10,11, 0xff, 0xff, 0xff, 0xff, 12,13, 0xff, 0xff, 0xff, 0xff,14,15};

    v0 = _mm_unpacklo_epi16(val->val[0], val->val[1]); //0,1, 3,4, 6,7, 9,10
    v2 = _mm_unpackhi_epi16(val->val[0], val->val[1]); //12,13, 15,16, 18,19, 21,22,
    v1 = _mm_alignr_epi8(v2, v0, 12); //9,10, 12,13, 15,16, 18,19
    v.val[0] = _mm_shuffle_epi8(v0, *(__m128i*)mask0); //make holes for the v.val[2] data embedding
    v.val[2] = _mm_shuffle_epi8(val->val[2], *(__m128i*)mask2lo); //make plugs for the v.val[2] data embedding
    cff = _mm_cmpeq_epi16(v0, v0); //all ff
    bldmask = _mm_cmpeq_epi16(*(__m128i*)mask0, cff);
    v.val[0] = _MM_BLENDV_EPI8(v.val[0], v.val[2],
bldmask);
    vst1q_u16(ptr, v.val[0]);
    v.val[0] = _mm_shuffle_epi8(v1, *(__m128i*)mask1); //make holes for the v.val[2] data embedding
    v.val[2] = _mm_shuffle_epi8(val->val[2], *(__m128i*)mask2med); //make plugs for the v.val[2] data embedding
    bldmask = _mm_cmpeq_epi16(*(__m128i*)mask1, cff);
    v.val[1] = _MM_BLENDV_EPI8(v.val[0],v.val[2], bldmask);
    vst1q_u16((ptr + 8), v.val[1]);
    v.val[0] = _mm_shuffle_epi8(v2, *(__m128i*)mask2); //make holes for the v.val[2] data embedding
    v.val[2] = _mm_shuffle_epi8(val->val[2], *(__m128i*)mask2hi); //make plugs for the v.val[2] data embedding
    bldmask = _mm_cmpeq_epi16(*(__m128i*)mask2, cff);
    v.val[2] = _MM_BLENDV_EPI8(v.val[0],v.val[2], bldmask );
    vst1q_u16((ptr + 16), v.val[2]);
}
#define vst3q_u16(ptr, val) vst3q_u16_ptr(ptr, &val)

//void vst3q_u32(__transfersize(12) uint32_t * ptr, uint32x4x3_t val)// VST3.32 {d0, d2, d4}, [r0]
_NEON2SSE_INLINE void vst3q_u32_ptr(__transfersize(12) uint32_t * ptr, uint32x4x3_t* val)
{
    //a0,a1,a2,a3, b0,b1,b2,b3, c0,c1,c2,c3 -> a0,b0,c0,a1, b1,c1,a2,b2, c2,a3,b3,c3
    uint32x4x3_t v;
    __m128i tmp0, tmp1,tmp2;
    tmp0 = _mm_unpacklo_epi32(val->val[0], val->val[1]); //a0,b0,a1,b1
    tmp1 = _mm_unpackhi_epi32(val->val[0], val->val[1]); //a2,b2,a3,b3
    tmp2 = _mm_unpacklo_epi32(val->val[1], val->val[2]); //b0,c0,b1,c1
    v.val[1] = _mm_castps_si128(_mm_shuffle_ps(_mm_castsi128_ps(tmp2),_mm_castsi128_ps(tmp1), _MM_SHUFFLE(1,0,3,2))); //b1,c1,a2,b2,
    v.val[2] = _mm_unpackhi_epi64(tmp1, val->val[2]); //a3,b3, c2,c3
    v.val[2] = _mm_shuffle_epi32(v.val[2], 2 | (0 << 2) | (1 << 4) | (3 << 6)); //c2,a3,b3,c3
    tmp1 = _mm_unpacklo_epi32(tmp2,val->val[0]); //b0,a0,c0,a1
    v.val[0] = _mm_castps_si128(_mm_shuffle_ps(_mm_castsi128_ps(tmp0),_mm_castsi128_ps(tmp1), _MM_SHUFFLE(3,2,1,0))); //a0,b0,c0,a1,

    vst1q_u32(ptr, v.val[0]);
    vst1q_u32((ptr + 4), v.val[1]);
    vst1q_u32((ptr + 8), v.val[2]);
}
#define vst3q_u32(ptr, val) vst3q_u32_ptr(ptr, &val)

//void
//vst3q_s8(__transfersize(48) int8_t * ptr, int8x16x3_t val);
void vst3q_s8_ptr(__transfersize(48) int8_t * ptr, int8x16x3_t * val);
#define vst3q_s8(ptr, val) vst3q_u8((uint8_t*)(ptr), val)

//void vst3q_s16(__transfersize(24) int16_t * ptr, int16x8x3_t val);
void vst3q_s16_ptr(__transfersize(24) int16_t * ptr, int16x8x3_t * val);
#define vst3q_s16(ptr, val) vst3q_u16((uint16_t*)(ptr), val)

//void vst3q_s32(__transfersize(12) int32_t * ptr, int32x4x3_t val);
void vst3q_s32_ptr(__transfersize(12) int32_t * ptr, int32x4x3_t * val);
#define vst3q_s32(ptr, val) vst3q_u32((uint32_t*)(ptr), val)

//void vst3q_f16(__transfersize(24) __fp16 * ptr, float16x8x3_t val);// VST3.16 {d0, d2, d4}, [r0]
void vst3q_f16_ptr(__transfersize(24) __fp16 * ptr, float16x8x3_t * val);
// IA32 SIMD doesn't work with 16bit floats currently

//void vst3q_f32(__transfersize(12) float32_t * ptr, float32x4x3_t val)// VST3.32 {d0, d2, d4}, [r0]
_NEON2SSE_INLINE void vst3q_f32_ptr(__transfersize(12) float32_t * ptr, float32x4x3_t* val)
{
    float32x4x3_t v;
    __m128 tmp0, tmp1,tmp2;
    tmp0 = _mm_unpacklo_ps(val->val[0], val->val[1]); //a0,b0,a1,b1
    tmp1 = _mm_unpackhi_ps(val->val[0], val->val[1]); //a2,b2,a3,b3
    tmp2 = _mm_unpacklo_ps(val->val[1], val->val[2]); //b0,c0,b1,c1
    v.val[1] = _mm_shuffle_ps(tmp2,tmp1, _MM_SHUFFLE(1,0,3,2)); //b1,c1,a2,b2,
    v.val[2] = _mm_movehl_ps(val->val[2],tmp1); //a3,b3, c2,c3
    v.val[2] = _mm_shuffle_ps(v.val[2],v.val[2], _MM_SHUFFLE(3,1,0,2)); //c2,a3,b3,c3
    tmp1 = _mm_unpacklo_ps(tmp2,val->val[0]); //b0,a0,c0,a1
    v.val[0] = _mm_shuffle_ps(tmp0,tmp1, _MM_SHUFFLE(3,2,1,0)); //a0,b0,c0,a1,

    vst1q_f32( ptr, v.val[0]);
    vst1q_f32( (ptr + 4), v.val[1]);
    vst1q_f32( (ptr + 8), v.val[2]);
}
#define vst3q_f32(ptr, val) vst3q_f32_ptr(ptr, &val)

//void vst3q_p8(__transfersize(48) poly8_t * ptr, poly8x16x3_t val);// VST3.8 {d0, d2, d4}, [r0]
void vst3q_p8_ptr(__transfersize(48) poly8_t * ptr, poly8x16x3_t * val);
#define vst3q_p8 vst3q_u8

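//All of the vst3 variants above produce the same memory layout: the elements
//of the three source vectors end up interleaved as a0,b0,c0, a1,b1,c1, ...
//As a scalar reference model of that store (the helper name and the plain
//arrays standing in for the vector triple are illustrative only, not part of
//this header):

```c
#include <stddef.h>
#include <assert.h>

/* Scalar sketch of a VST3-style store: writes a[i], b[i], c[i]
   consecutively for each element i, i.e. the interleaved layout that
   the shuffle/blend sequences above assemble in 128-bit registers. */
static void vst3_ref_u16(unsigned short *ptr,
                         const unsigned short *a,
                         const unsigned short *b,
                         const unsigned short *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        ptr[3 * i]     = a[i];
        ptr[3 * i + 1] = b[i];
        ptr[3 * i + 2] = c[i];
    }
}
```

//The SSE versions do the same work with three aligned 128-bit stores, which
//is why the masks punch "holes" for val[2] and then blend it in.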
//void vst3q_p16(__transfersize(24) poly16_t * ptr, poly16x8x3_t val);// VST3.16 {d0, d2, d4}, [r0]
void vst3q_p16_ptr(__transfersize(24) poly16_t * ptr, poly16x8x3_t * val);
#define vst3q_p16 vst3q_u16

//void vst3_u8(__transfersize(24) uint8_t * ptr, uint8x8x3_t val)// VST3.8 {d0, d1, d2}, [r0]
_NEON2SSE_INLINE void vst3_u8_ptr(__transfersize(24) uint8_t * ptr, uint8x8x3_t* val)
{
    __m128i tmp, sh0, sh1, val0, val2;
    _NEON2SSE_ALIGN_16 int8_t mask0[16] = { 0, 8, 16, 1, 9, 17, 2, 10, 18, 3, 11, 19, 4, 12, 20, 5};
    _NEON2SSE_ALIGN_16 int8_t mask1[16] = {13, 21, 6, 14, 22, 7, 15, 23, 0,0,0,0,0,0,0,0};
    _NEON2SSE_ALIGN_16 int8_t mask0_sel[16] = {0, 0, 0xff, 0, 0, 0xff, 0, 0, 0xff, 0, 0, 0xff, 0, 0, 0xff, 0};
    _NEON2SSE_ALIGN_16 int8_t mask1_sel[16] = {0, 0xff, 0, 0, 0xff, 0, 0, 0xff, 0,0,0,0,0,0,0,0};
    tmp = _mm_unpacklo_epi64(_pM128i(val->val[0]), _pM128i(val->val[1]) );
    sh0 = _mm_shuffle_epi8(tmp, *(__m128i*)mask0); //for bi>15 bi is wrapped (bi-=15)
    val2 = _pM128i(val->val[2]);
    sh1 = _mm_shuffle_epi8(val2, *(__m128i*)mask0);
    val0 = _MM_BLENDV_EPI8(sh0, sh1, *(__m128i*)mask0_sel);
    vst1q_u8(ptr, val0); //store as 128 bit structure
    sh0 = _mm_shuffle_epi8(tmp, *(__m128i*)mask1); //for bi>15 bi is wrapped (bi-=15)
    sh1 = _mm_shuffle_epi8(val2, *(__m128i*)mask1);
    val2 = _MM_BLENDV_EPI8(sh0, sh1, *(__m128i*)mask1_sel);
    _M64((*(__m64_128*)(ptr + 16)), val2); //need it to fit into *ptr memory
}
#define vst3_u8(ptr, val) vst3_u8_ptr(ptr, &val)

//void vst3_u16(__transfersize(12) uint16_t * ptr, uint16x4x3_t val)// VST3.16 {d0, d1, d2}, [r0]
_NEON2SSE_INLINE void vst3_u16_ptr(__transfersize(12) uint16_t * ptr, uint16x4x3_t* val)
{
    __m128i tmp, val0, val1, val2;
    _NEON2SSE_ALIGN_16 int8_t mask0[16] = {0,1, 8,9, 16,17, 2,3, 10,11, 18,19, 4,5, 12,13};
    _NEON2SSE_ALIGN_16 int8_t mask1[16] = {20,21, 6,7, 14,15, 22,23, 0,0,0,0,0,0,0,0};
    _NEON2SSE_ALIGN_16 uint16_t mask0f[8] = {0xffff, 0xffff, 0, 0xffff, 0xffff, 0, 0xffff, 0xffff}; //if all
//ones we take the result from v.val[0] otherwise from v.val[1]
    _NEON2SSE_ALIGN_16 uint16_t mask1f[8] = {0xffff, 0, 0, 0xffff, 0xffff, 0xffff, 0xffff, 0xffff}; //if all ones we take the result from v.val[1] otherwise from v.val[0]
    tmp = _mm_unpacklo_epi64(_pM128i(val->val[0]), _pM128i(val->val[1]));
    val0 = _mm_shuffle_epi8(tmp, *(__m128i*)mask0);
    val2 = _pM128i(val->val[2]);
    val1 = _mm_shuffle_epi8(val2, *(__m128i*)mask0);
    val0 = _MM_BLENDV_EPI8(val1, val0, *(__m128i*)mask0f);
    vst1q_u16(ptr, val0); //store as 128 bit structure
    val0 = _mm_shuffle_epi8(tmp, *(__m128i*)mask1);
    val1 = _mm_shuffle_epi8(val2, *(__m128i*)mask1);
    val1 = _MM_BLENDV_EPI8(val0, val1, *(__m128i*)mask1f); //change the operands order
    _M64((*(__m64_128*)(ptr + 8)), val1); //need it to fit into *ptr memory
}
#define vst3_u16(ptr, val) vst3_u16_ptr(ptr, &val)

//void vst3_u32(__transfersize(6) uint32_t * ptr, uint32x2x3_t val)// VST3.32 {d0, d1, d2}, [r0]
_NEON2SSE_INLINE void vst3_u32_ptr(__transfersize(6) uint32_t * ptr, uint32x2x3_t* val)
{
    //val->val[0]:0,3,val->val[1]:1,4; val->val[2]:2,5,x,x;
    __m128i val0, val1;
    val0 = _mm_unpacklo_epi64(_pM128i(val->val[1]), _pM128i(val->val[2])); //val[0]: 1,4,2,5
    val0 = _mm_shuffle_epi32(val0, 0 | (2 << 2) | (1 << 4) | (3 << 6)); //1,2,4,5
    val1 = _mm_srli_si128(val0, 8); //4,5, x,x
    _M64((*(__m64_128*)(ptr + 4)), val1);
    val0 = _mm_unpacklo_epi32(_pM128i(val->val[0]), val0); //0,1,3,2
    val0 = _mm_shuffle_epi32(val0, 0 | (1 << 2) | (3 << 4) | (2 << 6)); //0,1,2, 3
    vst1q_u32(ptr, val0); //store as 128 bit structure
}
#define vst3_u32(ptr, val) vst3_u32_ptr(ptr, &val)

//void vst3_u64(__transfersize(3) uint64_t * ptr, uint64x1x3_t val)// VST1.64 {d0, d1, d2}, [r0]
_NEON2SSE_INLINE void vst3_u64_ptr(__transfersize(3) uint64_t * ptr, uint64x1x3_t* val)
{
    *(ptr) = val->val[0].m64_u64[0];
    *(ptr + 1) = val->val[1].m64_u64[0];
    *(ptr + 2) = val->val[2].m64_u64[0];
}
#define vst3_u64(ptr, val) vst3_u64_ptr(ptr, \
&val)

//void vst3_s8(__transfersize(24) int8_t * ptr, int8x8x3_t val) // VST3.8 {d0, d1, d2}, [r0]
#define vst3_s8(ptr, val) vst3_u8_ptr((uint8_t*)ptr, &val)

//void vst3_s16(__transfersize(12) int16_t * ptr, int16x4x3_t val) // VST3.16 {d0, d1, d2}, [r0]
#define vst3_s16(ptr, val) vst3_u16_ptr((uint16_t*)ptr, &val)

//void vst3_s32(__transfersize(6) int32_t * ptr, int32x2x3_t val); // VST3.32 {d0, d1, d2}, [r0]
#define vst3_s32(ptr, val) vst3_u32_ptr((uint32_t*)ptr, &val)

//void vst3_s64(__transfersize(3) int64_t * ptr, int64x1x3_t val) // VST1.64 {d0, d1, d2}, [r0]
#define vst3_s64(ptr, val) vst3_u64_ptr((uint64_t*)ptr, &val)

//void vst3_f16(__transfersize(12) __fp16 * ptr, float16x4x3_t val);// VST3.16 {d0, d1, d2}, [r0]
void vst3_f16_ptr(__transfersize(12) __fp16 * ptr, float16x4x3_t * val); // VST3.16 {d0, d1, d2}, [r0]
// IA32 SIMD doesn't work with 16bit floats currently, so need to go to 32 bit and then work with two 128bit registers. See vld1q_f16 for example

//void vst3_f32(__transfersize(6) float32_t * ptr, float32x2x3_t val)// VST3.32 {d0, d1, d2}, [r0]
_NEON2SSE_INLINE void vst3_f32_ptr(__transfersize(6) float32_t * ptr, float32x2x3_t* val)
{
    //val->val[0]:0,3,val->val[1]:1,4; val->val[2]:2,5,x,x; -> 0,2, 4,1, 3,5
    *(ptr) = val->val[0].m64_f32[0];
    *(ptr + 1) = val->val[1].m64_f32[0];
    *(ptr + 2) = val->val[2].m64_f32[0];
    *(ptr + 3) = val->val[0].m64_f32[1];
    *(ptr + 4) = val->val[1].m64_f32[1];
    *(ptr + 5) = val->val[2].m64_f32[1];
}
#define vst3_f32(ptr, val) vst3_f32_ptr(ptr, &val)

//void vst3_p8(__transfersize(24) poly8_t * ptr, poly8x8x3_t val);// VST3.8 {d0, d1, d2}, [r0]
void vst3_p8_ptr(__transfersize(24) poly8_t * ptr, poly8x8x3_t * val);
#define vst3_p8 vst3_u8

//void vst3_p16(__transfersize(12) poly16_t * ptr, poly16x4x3_t val);// VST3.16 {d0, d1, d2}, [r0]
void vst3_p16_ptr(__transfersize(12) poly16_t * ptr, poly16x4x3_t * val);
#define vst3_p16 vst3_s16

//*************** Quadruples store
//********************************
//*********************************************************************
//void vst4q_u8(__transfersize(64) uint8_t * ptr, uint8x16x4_t val)// VST4.8 {d0, d2, d4, d6}, [r0]
_NEON2SSE_INLINE void vst4q_u8_ptr(__transfersize(64) uint8_t * ptr, uint8x16x4_t* val)
{
    __m128i tmp1, tmp2, res;
    tmp1 = _mm_unpacklo_epi8(val->val[0], val->val[1]); // 0,1, 4,5, 8,9, 12,13, 16,17, 20,21, 24,25, 28,29
    tmp2 = _mm_unpacklo_epi8(val->val[2], val->val[3]); // 2,3, 6,7, 10,11, 14,15, 18,19, 22,23, 26,27, 30,31
    res = _mm_unpacklo_epi16(tmp1, tmp2); //0,1, 2,3, 4,5, 6,7, 8,9, 10,11, 12,13, 14,15
    vst1q_u8(ptr, res);
    res = _mm_unpackhi_epi16(tmp1, tmp2); //16,17, 18,19, 20,21, 22,23, 24,25, 26,27, 28,29, 30,31
    vst1q_u8((ptr + 16), res);
    tmp1 = _mm_unpackhi_epi8(val->val[0], val->val[1]); //
    tmp2 = _mm_unpackhi_epi8(val->val[2], val->val[3]); //
    res = _mm_unpacklo_epi16(tmp1, tmp2); //
    vst1q_u8((ptr + 32), res);
    res = _mm_unpackhi_epi16(tmp1, tmp2); //
    vst1q_u8((ptr + 48), res);
}
#define vst4q_u8(ptr, val) vst4q_u8_ptr(ptr, &val)

//void vst4q_u16(__transfersize(32) uint16_t * ptr, uint16x8x4_t val)// VST4.16 {d0, d2, d4, d6}, [r0]
_NEON2SSE_INLINE void vst4q_u16_ptr(__transfersize(32) uint16_t * ptr, uint16x8x4_t* val)
{
    uint16x8x4_t v;
    __m128i tmp1, tmp2;
    tmp1 = _mm_unpacklo_epi16(val->val[0], val->val[1]); //0,1, 4,5, 8,9, 12,13
    tmp2 = _mm_unpacklo_epi16(val->val[2], val->val[3]); //2,3, 6,7 , 10,11, 14,15
    v.val[0] = _mm_unpacklo_epi32(tmp1, tmp2);
    v.val[1] = _mm_unpackhi_epi32(tmp1, tmp2);
    tmp1 = _mm_unpackhi_epi16(val->val[0], val->val[1]); //0,1, 4,5, 8,9, 12,13
    tmp2 = _mm_unpackhi_epi16(val->val[2], val->val[3]); //2,3, 6,7 , 10,11, 14,15
    v.val[2] = _mm_unpacklo_epi32(tmp1, tmp2);
    v.val[3] = _mm_unpackhi_epi32(tmp1, tmp2);
    vst1q_u16(ptr, v.val[0]);
    vst1q_u16((ptr + 8), v.val[1]);
    vst1q_u16((ptr + 16),v.val[2]);
    vst1q_u16((ptr + 24), v.val[3]);
}
#define vst4q_u16(ptr, val) \
vst4q_u16_ptr(ptr, &val) - -//void vst4q_u32(__transfersize(16) uint32_t * ptr, uint32x4x4_t val)// VST4.32 {d0, d2, d4, d6}, [r0] -_NEON2SSE_INLINE void vst4q_u32_ptr(__transfersize(16) uint32_t * ptr, uint32x4x4_t* val) -{ - uint32x4x4_t v; - __m128i tmp1, tmp2; - tmp1 = _mm_unpacklo_epi32(val->val[0], val->val[1]); //a0,b0, a1,b1 - tmp2 = _mm_unpacklo_epi32(val->val[2], val->val[3]); //c0,d0, c1,d1 - v.val[0] = _mm_unpacklo_epi64(tmp1, tmp2); //a0,b0,c0,d0 - v.val[1] = _mm_unpackhi_epi64(tmp1, tmp2); //a1,b1,c1,d1 - tmp1 = _mm_unpackhi_epi32(val->val[0], val->val[1]); //a2,b2, a3,b3 - tmp2 = _mm_unpackhi_epi32(val->val[2], val->val[3]); //c2,d2, c3,d3 - v.val[2] = _mm_unpacklo_epi64(tmp1, tmp2); //a2,b2,c2,d2 - v.val[3] = _mm_unpackhi_epi64(tmp1, tmp2); //a3,b3,c3,d3 - vst1q_u32(ptr, v.val[0]); - vst1q_u32((ptr + 4), v.val[1]); - vst1q_u32((ptr + 8), v.val[2]); - vst1q_u32((ptr + 12), v.val[3]); -} -#define vst4q_u32(ptr, val) vst4q_u32_ptr(ptr, &val) - -//void vst4q_s8(__transfersize(64) int8_t * ptr, int8x16x4_t val); -void vst4q_s8_ptr(__transfersize(64) int8_t * ptr, int8x16x4_t * val); -#define vst4q_s8(ptr, val) vst4q_u8((uint8_t*)(ptr), val) - -//void vst4q_s16(__transfersize(32) int16_t * ptr, int16x8x4_t val); -void vst4q_s16_ptr(__transfersize(32) int16_t * ptr, int16x8x4_t * val); -#define vst4q_s16(ptr, val) vst4q_u16((uint16_t*)(ptr), val) - -//void vst4q_s32(__transfersize(16) int32_t * ptr, int32x4x4_t val); -void vst4q_s32_ptr(__transfersize(16) int32_t * ptr, int32x4x4_t * val); -#define vst4q_s32(ptr, val) vst4q_u32((uint32_t*)(ptr), val) - -//void vst4q_f16(__transfersize(32) __fp16 * ptr, float16x8x4_t val);// VST4.16 {d0, d2, d4, d6}, [r0] -void vst4q_f16_ptr(__transfersize(32) __fp16 * ptr, float16x8x4_t * val); -// IA32 SIMD doesn't work with 16bit floats currently - -//void vst4q_f32(__transfersize(16) float32_t * ptr, float32x4x4_t val)// VST4.32 {d0, d2, d4, d6}, [r0] -_NEON2SSE_INLINE void vst4q_f32_ptr(__transfersize(16) float32_t * ptr, float32x4x4_t*
val) -{ - __m128 tmp3, tmp2, tmp1, tmp0; - float32x4x4_t v; - tmp0 = _mm_unpacklo_ps(val->val[0], val->val[1]); - tmp2 = _mm_unpacklo_ps(val->val[2], val->val[3]); - tmp1 = _mm_unpackhi_ps(val->val[0], val->val[1]); - tmp3 = _mm_unpackhi_ps(val->val[2], val->val[3]); - v.val[0] = _mm_movelh_ps(tmp0, tmp2); - v.val[1] = _mm_movehl_ps(tmp2, tmp0); - v.val[2] = _mm_movelh_ps(tmp1, tmp3); - v.val[3] = _mm_movehl_ps(tmp3, tmp1); - vst1q_f32(ptr, v.val[0]); - vst1q_f32((ptr + 4), v.val[1]); - vst1q_f32((ptr + 8), v.val[2]); - vst1q_f32((ptr + 12), v.val[3]); -} -#define vst4q_f32(ptr, val) vst4q_f32_ptr(ptr, &val) - -//void vst4q_p8(__transfersize(64) poly8_t * ptr, poly8x16x4_t val);// VST4.8 {d0, d2, d4, d6}, [r0] -void vst4q_p8_ptr(__transfersize(64) poly8_t * ptr, poly8x16x4_t * val); -#define vst4q_p8 vst4q_u8 - -//void vst4q_p16(__transfersize(32) poly16_t * ptr, poly16x8x4_t val);// VST4.16 {d0, d2, d4, d6}, [r0] -void vst4q_p16_ptr(__transfersize(32) poly16_t * ptr, poly16x8x4_t * val); -#define vst4q_p16 vst4q_s16 - -//void vst4_u8(__transfersize(32) uint8_t * ptr, uint8x8x4_t val)// VST4.8 {d0, d1, d2, d3}, [r0] -_NEON2SSE_INLINE void vst4_u8_ptr(__transfersize(32) uint8_t * ptr, uint8x8x4_t* val) -{ - __m128i sh0, sh1, val0, val2; - sh0 = _mm_unpacklo_epi8(_pM128i(val->val[0]),_pM128i(val->val[1])); // a0,b0,a1,b1,a2,b2,a3,b3,a4,b4,a5,b5, a6,b6,a7,b7, - sh1 = _mm_unpacklo_epi8(_pM128i(val->val[2]),_pM128i(val->val[3])); // c0,d0,c1,d1,c2,d2,c3,d3, c4,d4,c5,d5,c6,d6,c7,d7 - val0 = _mm_unpacklo_epi16(sh0,sh1); // a0,b0,c0,d0,a1,b1,c1,d1,a2,b2,c2,d2,a3,b3,c3,d3, - val2 = _mm_unpackhi_epi16(sh0,sh1); //a4,b4,c4,d4,a5,b5,c5,d5, a6,b6,c6,d6,a7,b7,c7,d7 - vst1q_u8(ptr, val0); - vst1q_u8((ptr + 16), val2); -} -#define vst4_u8(ptr, val) vst4_u8_ptr(ptr, &val) - -//void vst4_u16(__transfersize(16) uint16_t * ptr, uint16x4x4_t val)// VST4.16 {d0, d1, d2, d3}, [r0] -_NEON2SSE_INLINE void vst4_u16_ptr(__transfersize(16) uint16_t * ptr, uint16x4x4_t* val) -{ - __m128i sh0, 
sh1, val0, val2; - sh0 = _mm_unpacklo_epi16(_pM128i(val->val[0]),_pM128i(val->val[1])); //a0,a1,b0,b1,c0,c1,d0,d1, - sh1 = _mm_unpacklo_epi16(_pM128i(val->val[2]),_pM128i(val->val[3])); //a2,a3,b2,b3,c2,c3,d2,d3 - val0 = _mm_unpacklo_epi32(sh0,sh1); // a0,a1,a2,a3,b0,b1,b2,b3 - val2 = _mm_unpackhi_epi32(sh0,sh1); // c0,c1,c2,c3,d0,d1,d2,d3 - vst1q_u16(ptr, val0); //store as 128 bit structure - vst1q_u16((ptr + 8), val2); -} -#define vst4_u16(ptr, val) vst4_u16_ptr(ptr, &val) - -//void vst4_u32(__transfersize(8) uint32_t * ptr, uint32x2x4_t val)// VST4.32 {d0, d1, d2, d3}, [r0] -_NEON2SSE_INLINE void vst4_u32_ptr(__transfersize(8) uint32_t * ptr, uint32x2x4_t* val) -{ - //0,4, 1,5, 2,6, 3,7 - __m128i sh0, sh1, val0, val1; - sh0 = _mm_unpacklo_epi32(_pM128i(val->val[0]), _pM128i(val->val[1])); //0,1,4,5 - sh1 = _mm_unpacklo_epi32(_pM128i(val->val[2]), _pM128i(val->val[3])); //2,3,6,7 - val0 = _mm_unpacklo_epi64(sh0,sh1); // - val1 = _mm_unpackhi_epi64(sh0,sh1); // - vst1q_u32(ptr, val0); //store as 128 bit structure - vst1q_u32((ptr + 4), val1); -} -#define vst4_u32(ptr, val) vst4_u32_ptr(ptr, &val) - -//void vst4_u64(__transfersize(4) uint64_t * ptr, uint64x1x4_t val)// VST1.64 {d0, d1, d2, d3}, [r0] -_NEON2SSE_INLINE void vst4_u64_ptr(__transfersize(4) uint64_t * ptr, uint64x1x4_t* val) -{ - *(ptr) = val->val[0].m64_u64[0]; - *(ptr + 1) = val->val[1].m64_u64[0]; - *(ptr + 2) = val->val[2].m64_u64[0]; - *(ptr + 3) = val->val[3].m64_u64[0]; -} -#define vst4_u64(ptr, val) vst4_u64_ptr(ptr, &val) - -//void vst4_s8(__transfersize(32) int8_t * ptr, int8x8x4_t val) //VST4.8 {d0, d1, d2, d3}, [r0] -#define vst4_s8(ptr, val) vst4_u8((uint8_t*)ptr, val) - -//void vst4_s16(__transfersize(16) int16_t * ptr, int16x4x4_t val) // VST4.16 {d0, d1, d2, d3}, [r0] -#define vst4_s16(ptr, val) vst4_u16((uint16_t*)ptr, val) - -//void vst4_s32(__transfersize(8) int32_t * ptr, int32x2x4_t val) // VST4.32 {d0, d1, d2, d3}, [r0] -#define vst4_s32(ptr, val) vst4_u32((uint32_t*)ptr, val) - 
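All of the vst4 variants above produce the same memory layout: the four source vectors are interleaved element by element, so memory ends up holding a0,b0,c0,d0, a1,b1,c1,d1, and so on. The SSE code builds this with unpacklo/unpackhi stages; a scalar sketch of the layout it must match (hypothetical helper name, not part of this header) could look like:

```c
#include <assert.h>
#include <stdint.h>

/* Scalar reference for the vst4-style 4-way interleaved store:
   out[4*i + j] = src[j][i].  The unpacklo/unpackhi sequences above
   compute exactly this layout before the vst1q stores. */
void vst4_scalar_ref(uint32_t *out, const uint32_t src[4][2], int lanes)
{
    for (int i = 0; i < lanes; ++i)      /* lane index within each vector */
        for (int j = 0; j < 4; ++j)      /* which of the four vectors     */
            out[4 * i + j] = src[j][i];
}
```

With `src = {{0,4},{1,5},{2,6},{3,7}}` (the "0,4, 1,5, 2,6, 3,7" input noted in the vst4_u32 comment), the stored sequence is 0,1,2,3,4,5,6,7.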
-//void vst4_s64(__transfersize(4) int64_t * ptr, int64x1x4_t val); // VST1.64 {d0, d1, d2, d3}, [r0] -void vst4_s64_ptr(__transfersize(4) int64_t * ptr, int64x1x4_t * val); -#define vst4_s64(ptr, val) vst4_u64((uint64_t*)ptr, val) - -//void vst4_f16(__transfersize(16) __fp16 * ptr, float16x4x4_t val);// VST4.16 {d0, d1, d2, d3}, [r0] -void vst4_f16_ptr(__transfersize(16) __fp16 * ptr, float16x4x4_t * val); -// IA32 SIMD doesn't work with 16bit floats currently, so need to go to 32 bit and then work with two 128bit registers. See vld1q_f16 for example - -//void vst4_f32(__transfersize(8) float32_t * ptr, float32x2x4_t val)// VST4.32 {d0, d1, d2, d3}, [r0] -_NEON2SSE_INLINE void vst4_f32_ptr(__transfersize(8) float32_t * ptr, float32x2x4_t* val) -{ - //0,4, 1,5, 2,6, 3,7 -> 0,1, 2,3, 4,5, 6,7 - *(ptr) = val->val[0].m64_f32[0]; - *(ptr + 1) = val->val[1].m64_f32[0]; - *(ptr + 2) = val->val[2].m64_f32[0]; - *(ptr + 3) = val->val[3].m64_f32[0]; - *(ptr + 4) = val->val[0].m64_f32[1]; - *(ptr + 5) = val->val[1].m64_f32[1]; - *(ptr + 6) = val->val[2].m64_f32[1]; - *(ptr + 7) = val->val[3].m64_f32[1]; -} -#define vst4_f32(ptr, val) vst4_f32_ptr(ptr, &val) - -//void vst4_p8(__transfersize(32) poly8_t * ptr, poly8x8x4_t val);// VST4.8 {d0, d1, d2, d3}, [r0] -void vst4_p8_ptr(__transfersize(32) poly8_t * ptr, poly8x8x4_t * val); -#define vst4_p8 vst4_u8 - -//void vst4_p16(__transfersize(16) poly16_t * ptr, poly16x4x4_t val);// VST4.16 {d0, d1, d2, d3}, [r0] -void vst4_p16_ptr(__transfersize(16) poly16_t * ptr, poly16x4x4_t * val); -#define vst4_p16 vst4_u16 - -//*********** Store a lane of a vector into memory (extract given lane) for a couple of vectors ********************* -//******************************************************************************************************************** -//void vst2q_lane_u16(__transfersize(2) uint16_t * ptr, uint16x8x2_t val, __constrange(0,7) int lane)// VST2.16 {d0[0], d2[0]}, [r0] -_NEON2SSE_INLINE void 
vst2q_lane_u16_ptr(__transfersize(2) uint16_t * ptr, uint16x8x2_t* val, __constrange(0,7) int lane) -{ - vst1q_lane_s16(ptr, val->val[0], lane); - vst1q_lane_s16((ptr + 1), val->val[1], lane); -} -#define vst2q_lane_u16(ptr, val, lane) vst2q_lane_u16_ptr(ptr, &val, lane) - -//void vst2q_lane_u32(__transfersize(2) uint32_t * ptr, uint32x4x2_t val, __constrange(0,3) int lane)// VST2.32 {d0[0], d2[0]}, [r0] -_NEON2SSE_INLINE void vst2q_lane_u32_ptr(__transfersize(2) uint32_t* ptr, uint32x4x2_t* val, __constrange(0,3) int lane) -{ - vst1q_lane_u32(ptr, val->val[0], lane); - vst1q_lane_u32((ptr + 1), val->val[1], lane); -} -#define vst2q_lane_u32(ptr, val, lane) vst2q_lane_u32_ptr(ptr, &val, lane) - -//void vst2q_lane_s16(__transfersize(2) int16_t * ptr, int16x8x2_t val, __constrange(0,7) int lane);// VST2.16 {d0[0], d2[0]}, [r0] -void vst2q_lane_s16_ptr(__transfersize(2) int16_t * ptr, int16x8x2_t * val, __constrange(0,7) int lane); -#define vst2q_lane_s16(ptr, val, lane) vst2q_lane_u16((uint16_t*)ptr, val, lane) - -//void vst2q_lane_s32(__transfersize(2) int32_t * ptr, int32x4x2_t val, __constrange(0,3) int lane);// VST2.32 {d0[0], d2[0]}, [r0] -void vst2q_lane_s32_ptr(__transfersize(2) int32_t * ptr, int32x4x2_t * val, __constrange(0,3) int lane); -#define vst2q_lane_s32(ptr, val, lane) vst2q_lane_u32((uint32_t*)ptr, val, lane) - -//void vst2q_lane_f16(__transfersize(2) __fp16 * ptr, float16x8x2_t val, __constrange(0,7) int lane);// VST2.16 {d0[0], d2[0]}, [r0] -void vst2q_lane_f16_ptr(__transfersize(2) __fp16 * ptr, float16x8x2_t * val, __constrange(0,7) int lane); -//current IA SIMD doesn't support float16 - -//void vst2q_lane_f32(__transfersize(2) float32_t * ptr, float32x4x2_t val, __constrange(0,3) int lane)// VST2.32 {d0[0], d2[0]}, [r0] -_NEON2SSE_INLINE void vst2q_lane_f32_ptr(__transfersize(2) float32_t* ptr, float32x4x2_t* val, __constrange(0,3) int lane) -{ - vst1q_lane_f32(ptr, val->val[0], lane); - vst1q_lane_f32((ptr + 1), val->val[1], lane); -} 
-#define vst2q_lane_f32(ptr,src,lane) vst2q_lane_f32_ptr(ptr,&src,lane) - -//void vst2q_lane_p16(__transfersize(2) poly16_t * ptr, poly16x8x2_t val, __constrange(0,7) int lane);// VST2.16 {d0[0], d2[0]}, [r0] -void vst2q_lane_p16_ptr(__transfersize(2) poly16_t * ptr, poly16x8x2_t * val, __constrange(0,7) int lane); -#define vst2q_lane_p16 vst2q_lane_s16 - -//void vst2_lane_u8(__transfersize(2) uint8_t * ptr, uint8x8x2_t val, __constrange(0,7) int lane);// VST2.8 {d0[0], d1[0]}, [r0] -void vst2_lane_u8_ptr(__transfersize(2) uint8_t * ptr, uint8x8x2_t * val, __constrange(0,7) int lane); // VST2.8 {d0[0], d1[0]}, [r0] -_NEON2SSE_INLINE void vst2_lane_u8_ptr(__transfersize(2) uint8_t * ptr, uint8x8x2_t* val, __constrange(0,7) int lane) // VST2.8 {d0[0], d1[0]}, [r0] -{ - *(ptr) = val->val[0].m64_u8[lane]; - *(ptr + 1) = val->val[1].m64_u8[lane]; -} -#define vst2_lane_u8(ptr, val, lane) vst2_lane_u8_ptr(ptr, &val, lane) - -//void vst2_lane_u16(__transfersize(2) uint16_t * ptr, uint16x4x2_t val, __constrange(0,3) int lane);// VST2.16 {d0[0], d1[0]}, [r0] -void vst2_lane_u16_ptr(__transfersize(2) uint16_t * ptr, uint16x4x2_t * val, __constrange(0,3) int lane); // VST2.16 {d0[0], d1[0]}, [r0] -_NEON2SSE_INLINE void vst2_lane_u16_ptr(__transfersize(2) uint16_t * ptr, uint16x4x2_t * val, __constrange(0,3) int lane) -{ - *(ptr) = val->val[0].m64_u16[lane]; - *(ptr + 1) = val->val[1].m64_u16[lane]; -} -#define vst2_lane_u16(ptr, val, lane) vst2_lane_u16_ptr(ptr, &val, lane) - -//void vst2_lane_u32(__transfersize(2) uint32_t * ptr, uint32x2x2_t val, __constrange(0,1) int lane);// VST2.32 {d0[0], d1[0]}, [r0] -void vst2_lane_u32_ptr(__transfersize(2) uint32_t * ptr, uint32x2x2_t * val, __constrange(0,1) int lane); // VST2.32 {d0[0], d1[0]}, [r0] -_NEON2SSE_INLINE void vst2_lane_u32_ptr(__transfersize(2) uint32_t * ptr, uint32x2x2_t * val, __constrange(0,1) int lane) -{ - *(ptr) = val->val[0].m64_u32[lane]; - *(ptr + 1) = val->val[1].m64_u32[lane]; -} -#define vst2_lane_u32(ptr, 
val, lane) vst2_lane_u32_ptr(ptr, &val, lane) - -//void vst2_lane_s8(__transfersize(2) int8_t * ptr, int8x8x2_t val, __constrange(0,7) int lane);// VST2.8 {d0[0], d1[0]}, [r0] -void vst2_lane_s8_ptr(__transfersize(2) int8_t * ptr, int8x8x2_t * val, __constrange(0,7) int lane); -#define vst2_lane_s8(ptr, val, lane) vst2_lane_u8((uint8_t*)ptr, val, lane) - -//void vst2_lane_s16(__transfersize(2) int16_t * ptr, int16x4x2_t val, __constrange(0,3) int lane);// VST2.16 {d0[0], d1[0]}, [r0] -void vst2_lane_s16_ptr(__transfersize(2) int16_t * ptr, int16x4x2_t * val, __constrange(0,3) int lane); -#define vst2_lane_s16(ptr, val, lane) vst2_lane_u16((uint16_t*)ptr, val, lane) - -//void vst2_lane_s32(__transfersize(2) int32_t * ptr, int32x2x2_t val, __constrange(0,1) int lane);// VST2.32 {d0[0], d1[0]}, [r0] -void vst2_lane_s32_ptr(__transfersize(2) int32_t * ptr, int32x2x2_t * val, __constrange(0,1) int lane); -#define vst2_lane_s32(ptr, val, lane) vst2_lane_u32((uint32_t*)ptr, val, lane) - -//void vst2_lane_f16(__transfersize(2) __fp16 * ptr, float16x4x2_t val, __constrange(0,3) int lane); // VST2.16 {d0[0], d1[0]}, [r0] -//current IA SIMD doesn't support float16 - -void vst2_lane_f32_ptr(__transfersize(2) float32_t * ptr, float32x2x2_t * val, __constrange(0,1) int lane); // VST2.32 {d0[0], d1[0]}, [r0] -_NEON2SSE_INLINE void vst2_lane_f32_ptr(__transfersize(2) float32_t * ptr, float32x2x2_t * val, __constrange(0,1) int lane) -{ - *(ptr) = val->val[0].m64_f32[lane]; - *(ptr + 1) = val->val[1].m64_f32[lane]; -} -#define vst2_lane_f32(ptr,src,lane) vst2_lane_f32_ptr(ptr,&src,lane) - -//void vst2_lane_p8(__transfersize(2) poly8_t * ptr, poly8x8x2_t val, __constrange(0,7) int lane);// VST2.8 {d0[0], d1[0]}, [r0] -#define vst2_lane_p8 vst2_lane_u8 - -//void vst2_lane_p16(__transfersize(2) poly16_t * ptr, poly16x4x2_t val, __constrange(0,3) int lane);// VST2.16 {d0[0], d1[0]}, [r0] -#define vst2_lane_p16 vst2_lane_u16 - -//************************* Triple lanes stores 
******************************************************* -//******************************************************************************************************* -//void vst3q_lane_u16(__transfersize(3) uint16_t * ptr, uint16x8x3_t val, __constrange(0,7) int lane)// VST3.16 {d0[0], d2[0], d4[0]}, [r0] -_NEON2SSE_INLINE void vst3q_lane_u16_ptr(__transfersize(3) uint16_t * ptr, uint16x8x3_t* val, __constrange(0,7) int lane) -{ - vst2q_lane_u16_ptr(ptr, (uint16x8x2_t*)val, lane); - vst1q_lane_u16((ptr + 2), val->val[2], lane); -} -#define vst3q_lane_u16(ptr, val, lane) vst3q_lane_u16_ptr(ptr, &val, lane) - -//void vst3q_lane_u32(__transfersize(3) uint32_t * ptr, uint32x4x3_t val, __constrange(0,3) int lane)// VST3.32 {d0[0], d2[0], d4[0]}, [r0] -_NEON2SSE_INLINE void vst3q_lane_u32_ptr(__transfersize(3) uint32_t * ptr, uint32x4x3_t* val, __constrange(0,3) int lane) -{ - vst2q_lane_u32_ptr(ptr, (uint32x4x2_t*)val, lane); - vst1q_lane_u32((ptr + 2), val->val[2], lane); -} -#define vst3q_lane_u32(ptr, val, lane) vst3q_lane_u32_ptr(ptr, &val, lane) - -//void vst3q_lane_s16(__transfersize(3) int16_t * ptr, int16x8x3_t val, __constrange(0,7) int lane);// VST3.16 {d0[0], d2[0], d4[0]}, [r0] -void vst3q_lane_s16_ptr(__transfersize(3) int16_t * ptr, int16x8x3_t * val, __constrange(0,7) int lane); -#define vst3q_lane_s16(ptr, val, lane) vst3q_lane_u16((uint16_t *)ptr, val, lane) - -//void vst3q_lane_s32(__transfersize(3) int32_t * ptr, int32x4x3_t val, __constrange(0,3) int lane);// VST3.32 {d0[0], d2[0], d4[0]}, [r0] -void vst3q_lane_s32_ptr(__transfersize(3) int32_t * ptr, int32x4x3_t * val, __constrange(0,3) int lane); -#define vst3q_lane_s32(ptr, val, lane) vst3q_lane_u32((uint32_t *)ptr, val, lane) - -//void vst3q_lane_f16(__transfersize(3) __fp16 * ptr, float16x8x3_t val, __constrange(0,7) int lane);// VST3.16 {d0[0], d2[0], d4[0]}, [r0] -void vst3q_lane_f16_ptr(__transfersize(3) __fp16 * ptr, float16x8x3_t * val, __constrange(0,7) int lane); -//current IA SIMD doesn't 
support float16 - -//void vst3q_lane_f32(__transfersize(3) float32_t * ptr, float32x4x3_t val, __constrange(0,3) int lane)// VST3.32 {d0[0], d2[0], d4[0]}, [r0] -_NEON2SSE_INLINE void vst3q_lane_f32_ptr(__transfersize(3) float32_t * ptr, float32x4x3_t* val, __constrange(0,3) int lane) -{ - vst1q_lane_f32(ptr, val->val[0], lane); - vst1q_lane_f32((ptr + 1), val->val[1], lane); - vst1q_lane_f32((ptr + 2), val->val[2], lane); -} -#define vst3q_lane_f32(ptr,val,lane) vst3q_lane_f32_ptr(ptr,&val,lane) - -//void vst3q_lane_p16(__transfersize(3) poly16_t * ptr, poly16x8x3_t val, __constrange(0,7) int lane);// VST3.16 {d0[0], d2[0], d4[0]}, [r0] -void vst3q_lane_p16_ptr(__transfersize(3) poly16_t * ptr, poly16x8x3_t * val, __constrange(0,7) int lane); -#define vst3q_lane_p16 vst3q_lane_s16 - -//void vst3_lane_u8(__transfersize(3) uint8_t * ptr, uint8x8x3_t val, __constrange(0,7) int lane)// VST3.8 {d0[0], d1[0], d2[0]}, [r0] -_NEON2SSE_INLINE void vst3_lane_u8_ptr(__transfersize(3) uint8_t * ptr, uint8x8x3_t* val, __constrange(0,7) int lane) -{ - *(ptr) = val->val[0].m64_u8[lane]; - *(ptr + 1) = val->val[1].m64_u8[lane]; - *(ptr + 2) = val->val[2].m64_u8[lane]; -} -#define vst3_lane_u8(ptr, val, lane) vst3_lane_u8_ptr(ptr, &val, lane) - -//void vst3_lane_u16(__transfersize(3) uint16_t * ptr, uint16x4x3_t val, __constrange(0,3) int lane)// VST3.16 {d0[0], d1[0], d2[0]}, [r0] -_NEON2SSE_INLINE void vst3_lane_u16_ptr(__transfersize(3) uint16_t * ptr, uint16x4x3_t* val, __constrange(0,3) int lane) -{ - *(ptr) = val->val[0].m64_u16[lane]; - *(ptr + 1) = val->val[1].m64_u16[lane]; - *(ptr + 2) = val->val[2].m64_u16[lane]; -} -#define vst3_lane_u16(ptr, val, lane) vst3_lane_u16_ptr(ptr, &val, lane) - -//void vst3_lane_u32(__transfersize(3) uint32_t * ptr, uint32x2x3_t val, __constrange(0,1) int lane)// VST3.32 {d0[0], d1[0], d2[0]}, [r0] -_NEON2SSE_INLINE void vst3_lane_u32_ptr(__transfersize(3) uint32_t * ptr, uint32x2x3_t* val, __constrange(0,1) int lane) -{ - *(ptr) = 
val->val[0].m64_u32[lane]; - *(ptr + 1) = val->val[1].m64_u32[lane]; - *(ptr + 2) = val->val[2].m64_u32[lane]; -} -#define vst3_lane_u32(ptr, val, lane) vst3_lane_u32_ptr(ptr, &val, lane) - -//void vst3_lane_s8(__transfersize(3) int8_t * ptr, int8x8x3_t val, __constrange(0,7) int lane);// VST3.8 {d0[0], d1[0], d2[0]}, [r0] -void vst3_lane_s8_ptr(__transfersize(3) int8_t * ptr, int8x8x3_t * val, __constrange(0,7) int lane); -#define vst3_lane_s8(ptr, val, lane) vst3_lane_u8((uint8_t *)ptr, val, lane) - -//void vst3_lane_s16(__transfersize(3) int16_t * ptr, int16x4x3_t val, __constrange(0,3) int lane);// VST3.16 {d0[0], d1[0], d2[0]}, [r0] -void vst3_lane_s16_ptr(__transfersize(3) int16_t * ptr, int16x4x3_t * val, __constrange(0,3) int lane); -#define vst3_lane_s16(ptr, val, lane) vst3_lane_u16((uint16_t *)ptr, val, lane) - -//void vst3_lane_s32(__transfersize(3) int32_t * ptr, int32x2x3_t val, __constrange(0,1) int lane);// VST3.32 {d0[0], d1[0], d2[0]}, [r0] -void vst3_lane_s32_ptr(__transfersize(3) int32_t * ptr, int32x2x3_t * val, __constrange(0,1) int lane); -#define vst3_lane_s32(ptr, val, lane) vst3_lane_u32((uint32_t *)ptr, val, lane) - -//void vst3_lane_f16(__transfersize(3) __fp16 * ptr, float16x4x3_t val, __constrange(0,3) int lane);// VST3.16 {d0[0], d1[0], d2[0]}, [r0] -void vst3_lane_f16_ptr(__transfersize(3) __fp16 * ptr, float16x4x3_t * val, __constrange(0,3) int lane); -//current IA SIMD doesn't support float16 - -//void vst3_lane_f32(__transfersize(3) float32_t * ptr, float32x2x3_t val, __constrange(0,1) int lane)// VST3.32 {d0[0], d1[0], d2[0]}, [r0] -void vst3_lane_f32_ptr(__transfersize(3) float32_t * ptr, float32x2x3_t * val, __constrange(0,1) int lane); -_NEON2SSE_INLINE void vst3_lane_f32_ptr(__transfersize(3) float32_t * ptr, float32x2x3_t * val, __constrange(0,1) int lane) -{ - *(ptr) = val->val[0].m64_f32[lane]; - *(ptr + 1) = val->val[1].m64_f32[lane]; - *(ptr + 2) = val->val[2].m64_f32[lane]; -} -#define vst3_lane_f32(ptr,val,lane) 
vst3_lane_f32_ptr(ptr,&val,lane) - -//void vst3_lane_p8(__transfersize(3) poly8_t * ptr, poly8x8x3_t val, __constrange(0,7) int lane);// VST3.8 {d0[0], d1[0], d2[0]}, [r0] -void vst3_lane_p8_ptr(__transfersize(3) poly8_t * ptr, poly8x8x3_t * val, __constrange(0,7) int lane); -#define vst3_lane_p8 vst3_lane_u8 - -//void vst3_lane_p16(__transfersize(3) poly16_t * ptr, poly16x4x3_t val, __constrange(0,3) int lane);// VST3.16 {d0[0], d1[0], d2[0]}, [r0] -void vst3_lane_p16_ptr(__transfersize(3) poly16_t * ptr, poly16x4x3_t * val, __constrange(0,3) int lane); -#define vst3_lane_p16 vst3_lane_s16 - -//******************************** Quadruple lanes stores *********************************************** -//******************************************************************************************************* -//void vst4q_lane_u16(__transfersize(4) uint16_t * ptr, uint16x8x4_t val, __constrange(0,7) int lane)// VST4.16 {d0[0], d2[0], d4[0], d6[0]}, [r0] -_NEON2SSE_INLINE void vst4q_lane_u16_ptr(__transfersize(4) uint16_t * ptr, uint16x8x4_t* val4, __constrange(0,7) int lane) -{ - vst2q_lane_u16_ptr(ptr, (uint16x8x2_t*)val4->val, lane); - vst2q_lane_u16_ptr((ptr + 2),((uint16x8x2_t*)val4->val + 1), lane); -} -#define vst4q_lane_u16(ptr, val, lane) vst4q_lane_u16_ptr(ptr, &val, lane) - -//void vst4q_lane_u32(__transfersize(4) uint32_t * ptr, uint32x4x4_t val, __constrange(0,3) int lane)// VST4.32 {d0[0], d2[0], d4[0], d6[0]}, [r0] -_NEON2SSE_INLINE void vst4q_lane_u32_ptr(__transfersize(4) uint32_t * ptr, uint32x4x4_t* val4, __constrange(0,3) int lane) -{ - vst2q_lane_u32_ptr(ptr, (uint32x4x2_t*)val4->val, lane); - vst2q_lane_u32_ptr((ptr + 2), ((uint32x4x2_t*)val4->val + 1), lane); -} -#define vst4q_lane_u32(ptr, val, lane) vst4q_lane_u32_ptr(ptr, &val, lane) - -//void vst4q_lane_s16(__transfersize(4) int16_t * ptr, int16x8x4_t val, __constrange(0,7) int lane);// VST4.16 {d0[0], d2[0], d4[0], d6[0]}, [r0] -void vst4q_lane_s16_ptr(__transfersize(4) int16_t * ptr, 
int16x8x4_t * val, __constrange(0,7) int lane); -#define vst4q_lane_s16(ptr,val,lane) vst4q_lane_u16((uint16_t *)ptr,val,lane) - -//void vst4q_lane_s32(__transfersize(4) int32_t * ptr, int32x4x4_t val, __constrange(0,3) int lane);// VST4.32 {d0[0], d2[0], d4[0], d6[0]}, [r0] -void vst4q_lane_s32_ptr(__transfersize(4) int32_t * ptr, int32x4x4_t * val, __constrange(0,3) int lane); -#define vst4q_lane_s32(ptr,val,lane) vst4q_lane_u32((uint32_t *)ptr,val,lane) - -//void vst4q_lane_f16(__transfersize(4) __fp16 * ptr, float16x8x4_t val, __constrange(0,7) int lane);// VST4.16 {d0[0], d2[0], d4[0], d6[0]}, [r0] -void vst4q_lane_f16_ptr(__transfersize(4) __fp16 * ptr, float16x8x4_t * val, __constrange(0,7) int lane); -//current IA SIMD doesn't support float16 - -//void vst4q_lane_f32(__transfersize(4) float32_t * ptr, float32x4x4_t val, __constrange(0,3) int lane)// VST4.32 {d0[0], d2[0], d4[0], d6[0]}, [r0] -_NEON2SSE_INLINE void vst4q_lane_f32_ptr(__transfersize(4) float32_t * ptr, float32x4x4_t* val, __constrange(0,3) int lane) -{ - vst1q_lane_f32(ptr, val->val[0], lane); - vst1q_lane_f32((ptr + 1), val->val[1], lane); - vst1q_lane_f32((ptr + 2), val->val[2], lane); - vst1q_lane_f32((ptr + 3), val->val[3], lane); -} -#define vst4q_lane_f32(ptr,val,lane) vst4q_lane_f32_ptr(ptr,&val,lane) - -//void vst4q_lane_p16(__transfersize(4) poly16_t * ptr, poly16x8x4_t val, __constrange(0,7) int lane);// VST4.16 {d0[0], d2[0], d4[0], d6[0]}, [r0] -void vst4q_lane_p16_ptr(__transfersize(4) poly16_t * ptr, poly16x8x4_t * val, __constrange(0,7) int lane); -#define vst4q_lane_p16 vst4q_lane_u16 - -//void vst4_lane_u8(__transfersize(4) uint8_t * ptr, uint8x8x4_t val, __constrange(0,7) int lane)// VST4.8 {d0[0], d1[0], d2[0], d3[0]}, [r0] -_NEON2SSE_INLINE void vst4_lane_u8_ptr(__transfersize(4) uint8_t * ptr, uint8x8x4_t* val, __constrange(0,7) int lane) -{ - *(ptr) = val->val[0].m64_u8[lane]; - *(ptr + 1) = val->val[1].m64_u8[lane]; - *(ptr + 2) = val->val[2].m64_u8[lane]; - *(ptr + 3) 
= val->val[3].m64_u8[lane]; -} -#define vst4_lane_u8(ptr, val, lane) vst4_lane_u8_ptr(ptr, &val, lane) - -//void vst4_lane_u16(__transfersize(4) uint16_t * ptr, uint16x4x4_t val, __constrange(0,3) int lane)// VST4.16 {d0[0], d1[0], d2[0], d3[0]}, [r0] -_NEON2SSE_INLINE void vst4_lane_u16_ptr(__transfersize(4) uint16_t * ptr, uint16x4x4_t* val, __constrange(0,3) int lane) -{ - *(ptr) = val->val[0].m64_u16[lane]; - *(ptr + 1) = val->val[1].m64_u16[lane]; - *(ptr + 2) = val->val[2].m64_u16[lane]; - *(ptr + 3) = val->val[3].m64_u16[lane]; -} -#define vst4_lane_u16(ptr, val, lane) vst4_lane_u16_ptr(ptr, &val, lane) - -//void vst4_lane_u32(__transfersize(4) uint32_t * ptr, uint32x2x4_t val, __constrange(0,1) int lane)// VST4.32 {d0[0], d1[0], d2[0], d3[0]}, [r0] -_NEON2SSE_INLINE void vst4_lane_u32_ptr(__transfersize(4) uint32_t * ptr, uint32x2x4_t* val, __constrange(0,1) int lane) -{ - *(ptr) = val->val[0].m64_u32[lane]; - *(ptr + 1) = val->val[1].m64_u32[lane]; - *(ptr + 2) = val->val[2].m64_u32[lane]; - *(ptr + 3) = val->val[3].m64_u32[lane]; -} -#define vst4_lane_u32(ptr, val, lane) vst4_lane_u32_ptr(ptr, &val, lane) - -//void vst4_lane_s8(__transfersize(4) int8_t * ptr, int8x8x4_t val, __constrange(0,7) int lane)// VST4.8 {d0[0], d1[0], d2[0], d3[0]}, [r0] -#define vst4_lane_s8(ptr, val, lane) vst4_lane_u8((uint8_t*)ptr, val, lane) - -//void vst4_lane_s16(__transfersize(4) int16_t * ptr, int16x4x4_t val, __constrange(0,3) int lane)// VST4.16 {d0[0], d1[0], d2[0], d3[0]}, [r0] -#define vst4_lane_s16(ptr, val, lane) vst4_lane_u16((uint16_t*)ptr, val, lane) - -//void vst4_lane_s32(__transfersize(4) int32_t * ptr, int32x2x4_t val, __constrange(0,1) int lane)// VST4.32 {d0[0], d1[0], d2[0], d3[0]}, [r0] -#define vst4_lane_s32(ptr, val, lane) vst4_lane_u32((uint32_t*)ptr, val, lane) - -//void vst4_lane_f16(__transfersize(4) __fp16 * ptr, float16x4x4_t val, __constrange(0,3) int lane);// VST4.16 {d0[0], d1[0], d2[0], d3[0]}, [r0] -void vst4_lane_f16_ptr(__transfersize(4) 
__fp16 * ptr, float16x4x4_t * val, __constrange(0,3) int lane); -//current IA SIMD doesn't support float16 - -void vst4_lane_f32_ptr(__transfersize(4) float32_t * ptr, float32x2x4_t * val, __constrange(0,1) int lane); // VST4.32 {d0[0], d1[0], d2[0], d3[0]}, [r0] -_NEON2SSE_INLINE void vst4_lane_f32_ptr(__transfersize(4) float32_t * ptr, float32x2x4_t* val, __constrange(0,1) int lane) -{ - *(ptr) = val->val[0].m64_f32[lane]; - *(ptr + 1) = val->val[1].m64_f32[lane]; - *(ptr + 2) = val->val[2].m64_f32[lane]; - *(ptr + 3) = val->val[3].m64_f32[lane]; -} -#define vst4_lane_f32(ptr,val,lane) vst4_lane_f32_ptr(ptr,&val,lane) - -//void vst4_lane_p8(__transfersize(4) poly8_t * ptr, poly8x8x4_t val, __constrange(0,7) int lane);// VST4.8 {d0[0], d1[0], d2[0], d3[0]}, [r0] -void vst4_lane_p8_ptr(__transfersize(4) poly8_t * ptr, poly8x8x4_t * val, __constrange(0,7) int lane); -#define vst4_lane_p8 vst4_lane_u8 - -//void vst4_lane_p16(__transfersize(4) poly16_t * ptr, poly16x4x4_t val, __constrange(0,3) int lane);// VST4.16 {d0[0], d1[0], d2[0], d3[0]}, [r0] -void vst4_lane_p16_ptr(__transfersize(4) poly16_t * ptr, poly16x4x4_t * val, __constrange(0,3) int lane); -#define vst4_lane_p16 vst4_lane_u16 - -//************************************************************************************************** -//************************ Extract lanes from a vector ******************************************** -//************************************************************************************************** -//These intrinsics extract a single lane (element) from a vector. 
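Most of the extractions below are plain array accesses; the subtle one is vgetq_lane_f32, which pulls the lane out as raw integer bits (via _MM_EXTRACT_PS) and then reinterprets them as a float with a pointer cast. A portable sketch of that bit reinterpretation, using memcpy instead of `*(float*)&bits` to stay within strict-aliasing rules (helper names are illustrative, not part of this header):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Aliasing-safe equivalent of the header's  *(float*)&ilane  idiom:
   copy the 32 extracted bits into a float object. */
float f32_from_bits(int32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Inverse direction, e.g. for feeding a float lane back as bits. */
int32_t bits_from_f32(float f)
{
    int32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

memcpy of a known 4-byte size compiles to a single register move on mainstream compilers, so this costs nothing over the cast.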
-uint8_t vget_lane_u8(uint8x8_t vec, __constrange(0,7) int lane); // VMOV.U8 r0, d0[0] -#define vget_lane_u8(vec, lane) vec.m64_u8[lane] - -uint16_t vget_lane_u16(uint16x4_t vec, __constrange(0,3) int lane); // VMOV.s16 r0, d0[0] -#define vget_lane_u16(vec, lane) vec.m64_u16[lane] - - -uint32_t vget_lane_u32(uint32x2_t vec, __constrange(0,1) int lane); // VMOV.32 r0, d0[0] -#define vget_lane_u32(vec, lane) vec.m64_u32[lane] - -int8_t vget_lane_s8(int8x8_t vec, __constrange(0,7) int lane); // VMOV.S8 r0, d0[0] -#define vget_lane_s8(vec, lane) vec.m64_i8[lane] - -int16_t vget_lane_s16(int16x4_t vec, __constrange(0,3) int lane); // VMOV.S16 r0, d0[0] -#define vget_lane_s16(vec, lane) vec.m64_i16[lane] - -int32_t vget_lane_s32(int32x2_t vec, __constrange(0,1) int lane); // VMOV.32 r0, d0[0] -#define vget_lane_s32(vec, lane) vec.m64_i32[lane] - -poly8_t vget_lane_p8(poly8x8_t vec, __constrange(0,7) int lane); // VMOV.U8 r0, d0[0] -#define vget_lane_p8 vget_lane_u8 - -poly16_t vget_lane_p16(poly16x4_t vec, __constrange(0,3) int lane); // VMOV.s16 r0, d0[0] -#define vget_lane_p16 vget_lane_u16 - -float32_t vget_lane_f32(float32x2_t vec, __constrange(0,1) int lane); // VMOV.32 r0, d0[0] -#define vget_lane_f32(vec, lane) vec.m64_f32[lane] - -uint8_t vgetq_lane_u8(uint8x16_t vec, __constrange(0,15) int lane); // VMOV.U8 r0, d0[0] -#define vgetq_lane_u8 _MM_EXTRACT_EPI8 - -uint16_t vgetq_lane_u16(uint16x8_t vec, __constrange(0,7) int lane); // VMOV.s16 r0, d0[0] -#define vgetq_lane_u16 _MM_EXTRACT_EPI16 - -uint32_t vgetq_lane_u32(uint32x4_t vec, __constrange(0,3) int lane); // VMOV.32 r0, d0[0] -#define vgetq_lane_u32 _MM_EXTRACT_EPI32 - -int8_t vgetq_lane_s8(int8x16_t vec, __constrange(0,15) int lane); // VMOV.S8 r0, d0[0] -#define vgetq_lane_s8 vgetq_lane_u8 - -int16_t vgetq_lane_s16(int16x8_t vec, __constrange(0,7) int lane); // VMOV.S16 r0, d0[0] -#define vgetq_lane_s16 vgetq_lane_u16 - -int32_t vgetq_lane_s32(int32x4_t vec, __constrange(0,3) int lane); // VMOV.32 r0, 
d0[0] -#define vgetq_lane_s32 vgetq_lane_u32 - -poly8_t vgetq_lane_p8(poly8x16_t vec, __constrange(0,15) int lane); // VMOV.U8 r0, d0[0] -#define vgetq_lane_p8 vgetq_lane_u8 - -poly16_t vgetq_lane_p16(poly16x8_t vec, __constrange(0,7) int lane); // VMOV.s16 r0, d0[0] -#define vgetq_lane_p16 vgetq_lane_u16 - -float32_t vgetq_lane_f32(float32x4_t vec, __constrange(0,3) int lane); // VMOV.32 r0, d0[0] -_NEON2SSE_INLINE float32_t vgetq_lane_f32(float32x4_t vec, __constrange(0,3) int lane) -{ - int32_t ilane; - ilane = _MM_EXTRACT_PS(vec,lane); - return *(float*)&ilane; -} - -int64_t vget_lane_s64(int64x1_t vec, __constrange(0,0) int lane); // VMOV r0,r0,d0 -#define vget_lane_s64(vec, lane) vec.m64_i64[0] - -uint64_t vget_lane_u64(uint64x1_t vec, __constrange(0,0) int lane); // VMOV r0,r0,d0 -#define vget_lane_u64(vec, lane) vec.m64_u64[0] - - -int64_t vgetq_lane_s64(int64x2_t vec, __constrange(0,1) int lane); // VMOV r0,r0,d0 -#define vgetq_lane_s64 (int64_t) vgetq_lane_u64 - -uint64_t vgetq_lane_u64(uint64x2_t vec, __constrange(0,1) int lane); // VMOV r0,r0,d0 -#define vgetq_lane_u64 _MM_EXTRACT_EPI64 - -// ***************** Set lanes within a vector ******************************************** -// ************************************************************************************** -//These intrinsics set a single lane (element) within a vector. -//same functions as vld1_lane_xx ones, but take the value to be set directly. 
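The set-lane semantics the implementations below get from vld1_lane_xx are simply "copy the vector, overwrite one element, return the copy". A scalar sketch of that contract (the struct and helper are illustrative stand-ins, not the header's real types):

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t lane[2]; } u32x2;   /* stand-in for uint32x2_t */

/* Return a copy of vec with element `lane` replaced by `value` --
   the contract of vset_lane_u32.  vec is passed by value, so the
   caller's vector is left untouched. */
u32x2 set_lane_u32_ref(uint32_t value, u32x2 vec, int lane)
{
    vec.lane[lane] = value;
    return vec;
}
```

Note the argument order matches the NEON convention (value first, vector second), which is easy to get backwards when porting.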
- -uint8x8_t vset_lane_u8(uint8_t value, uint8x8_t vec, __constrange(0,7) int lane); // VMOV.8 d0[0],r0 -_NEON2SSE_INLINE uint8x8_t vset_lane_u8(uint8_t value, uint8x8_t vec, __constrange(0,7) int lane) -{ - uint8_t val; - val = value; - return vld1_lane_u8(&val, vec, lane); -} - -uint16x4_t vset_lane_u16(uint16_t value, uint16x4_t vec, __constrange(0,3) int lane); // VMOV.16 d0[0],r0 -_NEON2SSE_INLINE uint16x4_t vset_lane_u16(uint16_t value, uint16x4_t vec, __constrange(0,3) int lane) -{ - uint16_t val; - val = value; - return vld1_lane_u16(&val, vec, lane); -} - -uint32x2_t vset_lane_u32(uint32_t value, uint32x2_t vec, __constrange(0,1) int lane); // VMOV.32 d0[0],r0 -_NEON2SSE_INLINE uint32x2_t vset_lane_u32(uint32_t value, uint32x2_t vec, __constrange(0,1) int lane) -{ - uint32_t val; - val = value; - return vld1_lane_u32(&val, vec, lane); -} - -int8x8_t vset_lane_s8(int8_t value, int8x8_t vec, __constrange(0,7) int lane); // VMOV.8 d0[0],r0 -_NEON2SSE_INLINE int8x8_t vset_lane_s8(int8_t value, int8x8_t vec, __constrange(0,7) int lane) -{ - int8_t val; - val = value; - return vld1_lane_s8(&val, vec, lane); -} - -int16x4_t vset_lane_s16(int16_t value, int16x4_t vec, __constrange(0,3) int lane); // VMOV.16 d0[0],r0 -_NEON2SSE_INLINE int16x4_t vset_lane_s16(int16_t value, int16x4_t vec, __constrange(0,3) int lane) -{ - int16_t val; - val = value; - return vld1_lane_s16(&val, vec, lane); -} - -int32x2_t vset_lane_s32(int32_t value, int32x2_t vec, __constrange(0,1) int lane); // VMOV.32 d0[0],r0 -_NEON2SSE_INLINE int32x2_t vset_lane_s32(int32_t value, int32x2_t vec, __constrange(0,1) int lane) -{ - int32_t val; - val = value; - return vld1_lane_s32(&val, vec, lane); -} - -poly8x8_t vset_lane_p8(poly8_t value, poly8x8_t vec, __constrange(0,7) int lane); // VMOV.8 d0[0],r0 -#define vset_lane_p8 vset_lane_u8 - -poly16x4_t vset_lane_p16(poly16_t value, poly16x4_t vec, __constrange(0,3) int lane); // VMOV.16 d0[0],r0 -#define vset_lane_p16 vset_lane_u16 - -float32x2_t 
vset_lane_f32(float32_t value, float32x2_t vec, __constrange(0,1) int lane); // VMOV.32 d0[0],r0 -_NEON2SSE_INLINE float32x2_t vset_lane_f32(float32_t value, float32x2_t vec, __constrange(0,1) int lane) -{ - float32_t val; - val = value; - return vld1_lane_f32(&val, vec, lane); -} - -uint8x16_t vsetq_lane_u8(uint8_t value, uint8x16_t vec, __constrange(0,15) int lane); // VMOV.8 d0[0],r0 -_NEON2SSE_INLINE uint8x16_t vsetq_lane_u8(uint8_t value, uint8x16_t vec, __constrange(0,15) int lane) -{ - uint8_t val; - val = value; - return vld1q_lane_u8(&val, vec, lane); -} - -uint16x8_t vsetq_lane_u16(uint16_t value, uint16x8_t vec, __constrange(0,7) int lane); // VMOV.16 d0[0],r0 -_NEON2SSE_INLINE uint16x8_t vsetq_lane_u16(uint16_t value, uint16x8_t vec, __constrange(0,7) int lane) -{ - uint16_t val; - val = value; - return vld1q_lane_u16(&val, vec, lane); -} - -uint32x4_t vsetq_lane_u32(uint32_t value, uint32x4_t vec, __constrange(0,3) int lane); // VMOV.32 d0[0],r0 -_NEON2SSE_INLINE uint32x4_t vsetq_lane_u32(uint32_t value, uint32x4_t vec, __constrange(0,3) int lane) -{ - uint32_t val; - val = value; - return vld1q_lane_u32(&val, vec, lane); -} - -int8x16_t vsetq_lane_s8(int8_t value, int8x16_t vec, __constrange(0,15) int lane); // VMOV.8 d0[0],r0 -_NEON2SSE_INLINE int8x16_t vsetq_lane_s8(int8_t value, int8x16_t vec, __constrange(0,15) int lane) -{ - int8_t val; - val = value; - return vld1q_lane_s8(&val, vec, lane); -} - -int16x8_t vsetq_lane_s16(int16_t value, int16x8_t vec, __constrange(0,7) int lane); // VMOV.16 d0[0],r0 -_NEON2SSE_INLINE int16x8_t vsetq_lane_s16(int16_t value, int16x8_t vec, __constrange(0,7) int lane) -{ - int16_t val; - val = value; - return vld1q_lane_s16(&val, vec, lane); -} - -int32x4_t vsetq_lane_s32(int32_t value, int32x4_t vec, __constrange(0,3) int lane); // VMOV.32 d0[0],r0 -_NEON2SSE_INLINE int32x4_t vsetq_lane_s32(int32_t value, int32x4_t vec, __constrange(0,3) int lane) -{ - int32_t val; - val = value; - return vld1q_lane_s32(&val, vec, 
lane); -} - -poly8x16_t vsetq_lane_p8(poly8_t value, poly8x16_t vec, __constrange(0,15) int lane); // VMOV.8 d0[0],r0 -#define vsetq_lane_p8 vsetq_lane_u8 - -poly16x8_t vsetq_lane_p16(poly16_t value, poly16x8_t vec, __constrange(0,7) int lane); // VMOV.16 d0[0],r0 -#define vsetq_lane_p16 vsetq_lane_u16 - -float32x4_t vsetq_lane_f32(float32_t value, float32x4_t vec, __constrange(0,3) int lane); // VMOV.32 d0[0],r0 -_NEON2SSE_INLINE float32x4_t vsetq_lane_f32(float32_t value, float32x4_t vec, __constrange(0,3) int lane) -{ - float32_t val; - val = value; - return vld1q_lane_f32(&val, vec, lane); -} - -int64x1_t vset_lane_s64(int64_t value, int64x1_t vec, __constrange(0,0) int lane); // VMOV d0,r0,r0 -_NEON2SSE_INLINE int64x1_t vset_lane_s64(int64_t value, int64x1_t vec, __constrange(0,0) int lane) -{ - int64_t val; - val = value; - return vld1_lane_s64(&val, vec, lane); -} - -uint64x1_t vset_lane_u64(uint64_t value, uint64x1_t vec, __constrange(0,0) int lane); // VMOV d0,r0,r0 -_NEON2SSE_INLINE uint64x1_t vset_lane_u64(uint64_t value, uint64x1_t vec, __constrange(0,0) int lane) -{ - uint64_t val; - val = value; - return vld1_lane_u64(&val, vec, lane); -} - -int64x2_t vsetq_lane_s64(int64_t value, int64x2_t vec, __constrange(0,1) int lane); // VMOV d0,r0,r0 -_NEON2SSE_INLINE int64x2_t vsetq_lane_s64(int64_t value, int64x2_t vec, __constrange(0,1) int lane) -{ - int64_t val; - val = value; - return vld1q_lane_s64(&val, vec, lane); -} - -uint64x2_t vsetq_lane_u64(uint64_t value, uint64x2_t vec, __constrange(0,1) int lane); // VMOV d0,r0,r0 -#define vsetq_lane_u64 vsetq_lane_s64 - -// ******************************************************************* -// **************** Initialize a vector from bit pattern *************************** -// ******************************************************************************* -//These intrinsics create a vector from a literal bit pattern.
-int8x8_t vcreate_s8(uint64_t a); // VMOV d0,r0,r0 -#define vcreate_s8(a) (*(__m64_128*)&(a)) - - -int16x4_t vcreate_s16(uint64_t a); // VMOV d0,r0,r0 -#define vcreate_s16 vcreate_s8 - -int32x2_t vcreate_s32(uint64_t a); // VMOV d0,r0,r0 -#define vcreate_s32 vcreate_s8 - -float16x4_t vcreate_f16(uint64_t a); // VMOV d0,r0,r0 -//no IA32 SIMD available - -float32x2_t vcreate_f32(uint64_t a); // VMOV d0,r0,r0 -#define vcreate_f32(a) (*(__m64_128*)&(a)) - -uint8x8_t vcreate_u8(uint64_t a); // VMOV d0,r0,r0 -#define vcreate_u8 vcreate_s8 - -uint16x4_t vcreate_u16(uint64_t a); // VMOV d0,r0,r0 -#define vcreate_u16 vcreate_s16 - -uint32x2_t vcreate_u32(uint64_t a); // VMOV d0,r0,r0 -#define vcreate_u32 vcreate_s32 - -uint64x1_t vcreate_u64(uint64_t a); // VMOV d0,r0,r0 -#define vcreate_u64 vcreate_s8 - - -poly8x8_t vcreate_p8(uint64_t a); // VMOV d0,r0,r0 -#define vcreate_p8 vcreate_u8 - -poly16x4_t vcreate_p16(uint64_t a); // VMOV d0,r0,r0 -#define vcreate_p16 vcreate_u16 - -int64x1_t vcreate_s64(uint64_t a); // VMOV d0,r0,r0 -#define vcreate_s64 vcreate_u64 - -//********************* Set all lanes to same value ******************************** -//********************************************************************************* -//These intrinsics set all lanes to the same value.
-uint8x8_t vdup_n_u8(uint8_t value); // VDUP.8 d0,r0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint8x8_t vdup_n_u8(uint8_t value), _NEON2SSE_REASON_SLOW_SERIAL) -{ - uint8x8_t res; - int i; - for (i = 0; i<8; i++) { - res.m64_u8[i] = value; - } - return res; -} - -uint16x4_t vdup_n_u16(uint16_t value); // VDUP.16 d0,r0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint16x4_t vdup_n_u16(uint16_t value), _NEON2SSE_REASON_SLOW_SERIAL) -{ - uint16x4_t res; - int i; - for (i = 0; i<4; i++) { - res.m64_u16[i] = value; - } - return res; -} - -uint32x2_t vdup_n_u32(uint32_t value); // VDUP.32 d0,r0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(uint32x2_t vdup_n_u32(uint32_t value), _NEON2SSE_REASON_SLOW_SERIAL) -{ - uint32x2_t res; - res.m64_u32[0] = value; - res.m64_u32[1] = value; - return res; -} - -int8x8_t vdup_n_s8(int8_t value); // VDUP.8 d0,r0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int8x8_t vdup_n_s8(int8_t value), _NEON2SSE_REASON_SLOW_SERIAL) -{ - int8x8_t res; - int i; - for (i = 0; i<8; i++) { - res.m64_i8[i] = value; - } - return res; -} - -int16x4_t vdup_n_s16(int16_t value); // VDUP.16 d0,r0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int16x4_t vdup_n_s16(int16_t value), _NEON2SSE_REASON_SLOW_SERIAL) -{ - int16x4_t res; - int i; - for (i = 0; i<4; i++) { - res.m64_i16[i] = value; - } - return res; -} - -int32x2_t vdup_n_s32(int32_t value); // VDUP.32 d0,r0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x2_t vdup_n_s32(int32_t value), _NEON2SSE_REASON_SLOW_SERIAL) -{ - int32x2_t res; - res.m64_i32[0] = value; - res.m64_i32[1] = value; - return res; -} - -poly8x8_t vdup_n_p8(poly8_t value); // VDUP.8 d0,r0 -#define vdup_n_p8 vdup_n_u8 - -poly16x4_t vdup_n_p16(poly16_t value); // VDUP.16 d0,r0 -#define vdup_n_p16 vdup_n_s16 - -float32x2_t vdup_n_f32(float32_t value); // VDUP.32 d0,r0 -_NEON2SSE_INLINE float32x2_t vdup_n_f32(float32_t value) -{ - float32x2_t res; - res.m64_f32[0] = value; - res.m64_f32[1] = value; - return res; -} - 
-uint8x16_t vdupq_n_u8(uint8_t value); // VDUP.8 q0,r0 -#define vdupq_n_u8(value) _mm_set1_epi8((uint8_t) (value)) - -uint16x8_t vdupq_n_u16(uint16_t value); // VDUP.16 q0,r0 -#define vdupq_n_u16(value) _mm_set1_epi16((uint16_t) (value)) - -uint32x4_t vdupq_n_u32(uint32_t value); // VDUP.32 q0,r0 -#define vdupq_n_u32(value) _mm_set1_epi32((uint32_t) (value)) - -int8x16_t vdupq_n_s8(int8_t value); // VDUP.8 q0,r0 -#define vdupq_n_s8 _mm_set1_epi8 - -int16x8_t vdupq_n_s16(int16_t value); // VDUP.16 q0,r0 -#define vdupq_n_s16 _mm_set1_epi16 - -int32x4_t vdupq_n_s32(int32_t value); // VDUP.32 q0,r0 -#define vdupq_n_s32 _mm_set1_epi32 - -poly8x16_t vdupq_n_p8(poly8_t value); // VDUP.8 q0,r0 -#define vdupq_n_p8 vdupq_n_u8 - -poly16x8_t vdupq_n_p16(poly16_t value); // VDUP.16 q0,r0 -#define vdupq_n_p16 vdupq_n_u16 - -float32x4_t vdupq_n_f32(float32_t value); // VDUP.32 q0,r0 -#define vdupq_n_f32 _mm_set1_ps - -int64x1_t vdup_n_s64(int64_t value); // VMOV d0,r0,r0 -_NEON2SSE_INLINE int64x1_t vdup_n_s64(int64_t value) -{ - int64x1_t res; - res.m64_i64[0] = value; - return res; -} - -uint64x1_t vdup_n_u64(uint64_t value); // VMOV d0,r0,r0 -_NEON2SSE_INLINE uint64x1_t vdup_n_u64(uint64_t value) -{ - uint64x1_t res; - res.m64_u64[0] = value; - return res; -} - -int64x2_t vdupq_n_s64(int64_t value); // VMOV d0,r0,r0 -_NEON2SSE_INLINE int64x2_t vdupq_n_s64(int64_t value) -{ - _NEON2SSE_ALIGN_16 int64_t value2[2] = {value, value}; //value may be an immediate - return LOAD_SI128(value2); -} - -uint64x2_t vdupq_n_u64(uint64_t value); // VMOV d0,r0,r0 -_NEON2SSE_INLINE uint64x2_t vdupq_n_u64(uint64_t value) -{ - _NEON2SSE_ALIGN_16 uint64_t val[2] = {value, value}; //value may be an immediate - return LOAD_SI128(val); -} - -//**** Set all lanes to same value ************************ -//Same functions as above - just aliases.******************** -//They probably reflect the fact that the 128-bit function versions use the VMOV instruction ********** -uint8x8_t vmov_n_u8(uint8_t value); //
VDUP.8 d0,r0 -#define vmov_n_u8 vdup_n_s8 - -uint16x4_t vmov_n_u16(uint16_t value); // VDUP.16 d0,r0 -#define vmov_n_u16 vdup_n_s16 - -uint32x2_t vmov_n_u32(uint32_t value); // VDUP.32 d0,r0 -#define vmov_n_u32 vdup_n_u32 - -int8x8_t vmov_n_s8(int8_t value); // VDUP.8 d0,r0 -#define vmov_n_s8 vdup_n_s8 - -int16x4_t vmov_n_s16(int16_t value); // VDUP.16 d0,r0 -#define vmov_n_s16 vdup_n_s16 - -int32x2_t vmov_n_s32(int32_t value); // VDUP.32 d0,r0 -#define vmov_n_s32 vdup_n_s32 - -poly8x8_t vmov_n_p8(poly8_t value); // VDUP.8 d0,r0 -#define vmov_n_p8 vdup_n_u8 - -poly16x4_t vmov_n_p16(poly16_t value); // VDUP.16 d0,r0 -#define vmov_n_p16 vdup_n_s16 - -float32x2_t vmov_n_f32(float32_t value); // VDUP.32 d0,r0 -#define vmov_n_f32 vdup_n_f32 - -uint8x16_t vmovq_n_u8(uint8_t value); // VDUP.8 q0,r0 -#define vmovq_n_u8 vdupq_n_u8 - -uint16x8_t vmovq_n_u16(uint16_t value); // VDUP.16 q0,r0 -#define vmovq_n_u16 vdupq_n_s16 - -uint32x4_t vmovq_n_u32(uint32_t value); // VDUP.32 q0,r0 -#define vmovq_n_u32 vdupq_n_u32 - -int8x16_t vmovq_n_s8(int8_t value); // VDUP.8 q0,r0 -#define vmovq_n_s8 vdupq_n_s8 - -int16x8_t vmovq_n_s16(int16_t value); // VDUP.16 q0,r0 -#define vmovq_n_s16 vdupq_n_s16 - -int32x4_t vmovq_n_s32(int32_t value); // VDUP.32 q0,r0 -#define vmovq_n_s32 vdupq_n_s32 - -poly8x16_t vmovq_n_p8(poly8_t value); // VDUP.8 q0,r0 -#define vmovq_n_p8 vdupq_n_u8 - -poly16x8_t vmovq_n_p16(poly16_t value); // VDUP.16 q0,r0 -#define vmovq_n_p16 vdupq_n_s16 - -float32x4_t vmovq_n_f32(float32_t value); // VDUP.32 q0,r0 -#define vmovq_n_f32 vdupq_n_f32 - -int64x1_t vmov_n_s64(int64_t value); // VMOV d0,r0,r0 -#define vmov_n_s64 vdup_n_s64 - -uint64x1_t vmov_n_u64(uint64_t value); // VMOV d0,r0,r0 -#define vmov_n_u64 vdup_n_u64 - -int64x2_t vmovq_n_s64(int64_t value); // VMOV d0,r0,r0 -#define vmovq_n_s64 vdupq_n_s64 - -uint64x2_t vmovq_n_u64(uint64_t value); // VMOV d0,r0,r0 -#define vmovq_n_u64 vdupq_n_u64 - -//**************Set all lanes to the value of one lane of a vector 
************* -//**************************************************************************** -//here shuffle is better solution than lane extraction followed by set1 function -uint8x8_t vdup_lane_u8(uint8x8_t vec, __constrange(0,7) int lane); // VDUP.8 d0,d0[0] -_NEON2SSE_INLINE uint8x8_t vdup_lane_u8(uint8x8_t vec, __constrange(0,7) int lane) -{ - uint8x8_t res; - uint8_t valane; - int i = 0; - valane = vec.m64_u8[lane]; - for (i = 0; i<8; i++) { - res.m64_u8[i] = valane; - } - return res; -} - -uint16x4_t vdup_lane_u16(uint16x4_t vec, __constrange(0,3) int lane); // VDUP.16 d0,d0[0] -_NEON2SSE_INLINE uint16x4_t vdup_lane_u16(uint16x4_t vec, __constrange(0,3) int lane) -{ - uint16x4_t res; - uint16_t valane; - valane = vec.m64_u16[lane]; - res.m64_u16[0] = valane; - res.m64_u16[1] = valane; - res.m64_u16[2] = valane; - res.m64_u16[3] = valane; - return res; -} - -uint32x2_t vdup_lane_u32(uint32x2_t vec, __constrange(0,1) int lane); // VDUP.32 d0,d0[0] -_NEON2SSE_INLINE uint32x2_t vdup_lane_u32(uint32x2_t vec, __constrange(0,1) int lane) -{ - uint32x2_t res; - res.m64_u32[0] = vec.m64_u32[lane]; - res.m64_u32[1] = res.m64_u32[0]; - return res; -} - -int8x8_t vdup_lane_s8(int8x8_t vec, __constrange(0,7) int lane); // VDUP.8 d0,d0[0] -#define vdup_lane_s8 vdup_lane_u8 - -int16x4_t vdup_lane_s16(int16x4_t vec, __constrange(0,3) int lane); // VDUP.16 d0,d0[0] -#define vdup_lane_s16 vdup_lane_u16 - -int32x2_t vdup_lane_s32(int32x2_t vec, __constrange(0,1) int lane); // VDUP.32 d0,d0[0] -#define vdup_lane_s32 vdup_lane_u32 - -poly8x8_t vdup_lane_p8(poly8x8_t vec, __constrange(0,7) int lane); // VDUP.8 d0,d0[0] -#define vdup_lane_p8 vdup_lane_u8 - -poly16x4_t vdup_lane_p16(poly16x4_t vec, __constrange(0,3) int lane); // VDUP.16 d0,d0[0] -#define vdup_lane_p16 vdup_lane_s16 - -float32x2_t vdup_lane_f32(float32x2_t vec, __constrange(0,1) int lane); // VDUP.32 d0,d0[0] -_NEON2SSE_INLINE float32x2_t vdup_lane_f32(float32x2_t vec, __constrange(0,1) int lane) -{ - float32x2_t 
res; - res.m64_f32[0] = vec.m64_f32[lane]; - res.m64_f32[1] = res.m64_f32[0]; - return res; -} - -uint8x16_t vdupq_lane_u8(uint8x8_t vec, __constrange(0,7) int lane); // VDUP.8 q0,d0[0] -_NEON2SSE_INLINE uint8x16_t vdupq_lane_u8(uint8x8_t vec, __constrange(0,7) int lane) // VDUP.8 q0,d0[0] -{ - _NEON2SSE_ALIGN_16 int8_t lanemask8[16] = {lane, lane, lane, lane, lane, lane, lane, lane, lane, lane, lane, lane, lane, lane, lane, lane}; - return _mm_shuffle_epi8 (_pM128i(vec), *(__m128i*) lanemask8); -} - -uint16x8_t vdupq_lane_u16(uint16x4_t vec, __constrange(0,3) int lane); // VDUP.16 q0,d0[0] -_NEON2SSE_INLINE uint16x8_t vdupq_lane_u16(uint16x4_t vec, __constrange(0,3) int lane) // VDUP.16 q0,d0[0] -{ - //we could use 8bit shuffle for 16 bit as well - const int8_t lane16 = ((int8_t) lane) << 1; - _NEON2SSE_ALIGN_16 int8_t lanemask_e16[16] = {lane16, lane16 + 1, lane16, lane16 + 1, lane16, lane16 + 1, lane16, lane16 + 1, - lane16, lane16 + 1, lane16, lane16 + 1, lane16, lane16 + 1, lane16, lane16 + 1}; - return _mm_shuffle_epi8 (_pM128i(vec), *(__m128i*)lanemask_e16); -} - -uint32x4_t vdupq_lane_u32(uint32x2_t vec, __constrange(0,1) int lane); // VDUP.32 q0,d0[0] -#define vdupq_lane_u32(vec, lane) _mm_shuffle_epi32 (_pM128i(vec), lane | (lane << 2) | (lane << 4) | (lane << 6)) - -int8x16_t vdupq_lane_s8(int8x8_t vec, __constrange(0,7) int lane); // VDUP.8 q0,d0[0] -#define vdupq_lane_s8 vdupq_lane_u8 - -int16x8_t vdupq_lane_s16(int16x4_t vec, __constrange(0,3) int lane); // VDUP.16 q0,d0[0] -#define vdupq_lane_s16 vdupq_lane_u16 - -int32x4_t vdupq_lane_s32(int32x2_t vec, __constrange(0,1) int lane); // VDUP.32 q0,d0[0] -#define vdupq_lane_s32 vdupq_lane_u32 - -poly8x16_t vdupq_lane_p8(poly8x8_t vec, __constrange(0,7) int lane); // VDUP.8 q0,d0[0] -#define vdupq_lane_p8 vdupq_lane_u8 - -poly16x8_t vdupq_lane_p16(poly16x4_t vec, __constrange(0,3) int lane); // VDUP.16 q0,d0[0] -#define vdupq_lane_p16 vdupq_lane_s16 - -float32x4_t vdupq_lane_f32(float32x2_t vec, 
__constrange(0,1) int lane); // VDUP.32 q0,d0[0] -#define vdupq_lane_f32(vec, lane) _mm_load1_ps((vec.m64_f32 + lane)) - -int64x1_t vdup_lane_s64(int64x1_t vec, __constrange(0,0) int lane); // VMOV d0,d0 -#define vdup_lane_s64(vec,lane) vec - -uint64x1_t vdup_lane_u64(uint64x1_t vec, __constrange(0,0) int lane); // VMOV d0,d0 -#define vdup_lane_u64(vec,lane) vec - -int64x2_t vdupq_lane_s64(int64x1_t vec, __constrange(0,0) int lane); // VMOV q0,q0 -_NEON2SSE_INLINE int64x2_t vdupq_lane_s64(int64x1_t vec, __constrange(0,0) int lane) -{ - __m128i vec128; - vec128 = _pM128i(vec); - return _mm_unpacklo_epi64(vec128,vec128); -} - -uint64x2_t vdupq_lane_u64(uint64x1_t vec, __constrange(0,0) int lane); // VMOV q0,q0 -#define vdupq_lane_u64 vdupq_lane_s64 - -// ******************************************************************** -// ******************** Combining vectors ***************************** -// ******************************************************************** -//These intrinsics join two 64 bit vectors into a single 128bit vector. 
-int8x16_t vcombine_s8(int8x8_t low, int8x8_t high); // VMOV d0,d0 -#define vcombine_s8(low, high) _mm_unpacklo_epi64 (_pM128i(low), _pM128i(high) ) - -int16x8_t vcombine_s16(int16x4_t low, int16x4_t high); // VMOV d0,d0 -#define vcombine_s16(low, high) _mm_unpacklo_epi64 (_pM128i(low), _pM128i(high) ) - -int32x4_t vcombine_s32(int32x2_t low, int32x2_t high); // VMOV d0,d0 -#define vcombine_s32(low, high) _mm_unpacklo_epi64 (_pM128i(low), _pM128i(high) ) - -int64x2_t vcombine_s64(int64x1_t low, int64x1_t high); // VMOV d0,d0 -#define vcombine_s64(low, high) _mm_unpacklo_epi64 (_pM128i(low), _pM128i(high) ) - -float16x8_t vcombine_f16(float16x4_t low, float16x4_t high); // VMOV d0,d0 -//current IA SIMD doesn't support float16 - -float32x4_t vcombine_f32(float32x2_t low, float32x2_t high); // VMOV d0,d0 -_NEON2SSE_INLINE float32x4_t vcombine_f32(float32x2_t low, float32x2_t high) -{ - __m128i res; - res = _mm_unpacklo_epi64(_pM128i(low), _pM128i(high) ); - return _M128(res); -} - -uint8x16_t vcombine_u8(uint8x8_t low, uint8x8_t high); // VMOV d0,d0 -#define vcombine_u8 vcombine_s8 - -uint16x8_t vcombine_u16(uint16x4_t low, uint16x4_t high); // VMOV d0,d0 -#define vcombine_u16 vcombine_s16 - -uint32x4_t vcombine_u32(uint32x2_t low, uint32x2_t high); // VMOV d0,d0 -#define vcombine_u32 vcombine_s32 - -uint64x2_t vcombine_u64(uint64x1_t low, uint64x1_t high); // VMOV d0,d0 -#define vcombine_u64 vcombine_s64 - -poly8x16_t vcombine_p8(poly8x8_t low, poly8x8_t high); // VMOV d0,d0 -#define vcombine_p8 vcombine_u8 - -poly16x8_t vcombine_p16(poly16x4_t low, poly16x4_t high); // VMOV d0,d0 -#define vcombine_p16 vcombine_u16 - -//********************************************************************** -//************************* Splitting vectors ************************** -//********************************************************************** -//**************** Get high part ****************************************** -//These intrinsics split a 128 bit vector into 2 
component 64 bit vectors -int8x8_t vget_high_s8(int8x16_t a); // VMOV d0,d0 -_NEON2SSE_INLINE int8x8_t vget_high_s8(int8x16_t a) -{ - int8x8_t res64; - __m128i res; - res = _mm_unpackhi_epi64(a,a); //SSE2 - return64(res); -} - -int16x4_t vget_high_s16(int16x8_t a); // VMOV d0,d0 -_NEON2SSE_INLINE int16x4_t vget_high_s16(int16x8_t a) -{ - int16x4_t res64; - __m128i res; - res = _mm_unpackhi_epi64(a,a); //SSE2 - return64(res); -} - -int32x2_t vget_high_s32(int32x4_t a); // VMOV d0,d0 -_NEON2SSE_INLINE int32x2_t vget_high_s32(int32x4_t a) -{ - int32x2_t res64; - __m128i res; - res = _mm_unpackhi_epi64(a,a); //SSE2 - return64(res); -} - -int64x1_t vget_high_s64(int64x2_t a); // VMOV d0,d0 -_NEON2SSE_INLINE int64x1_t vget_high_s64(int64x2_t a) -{ - int64x1_t res64; - __m128i res; - res = _mm_unpackhi_epi64(a,a); //SSE2 - return64(res); -} - -float16x4_t vget_high_f16(float16x8_t a); // VMOV d0,d0 -// IA32 SIMD doesn't work with 16bit floats currently - -float32x2_t vget_high_f32(float32x4_t a); // VMOV d0,d0 -_NEON2SSE_INLINE float32x2_t vget_high_f32(float32x4_t a) -{ - __m128i res; - __m64_128 res64; - res = _mm_unpackhi_epi64(_M128i(a),_M128i(a)); - return64(res); -} - -uint8x8_t vget_high_u8(uint8x16_t a); // VMOV d0,d0 -#define vget_high_u8 vget_high_s8 - -uint16x4_t vget_high_u16(uint16x8_t a); // VMOV d0,d0 -#define vget_high_u16 vget_high_s16 - -uint32x2_t vget_high_u32(uint32x4_t a); // VMOV d0,d0 -#define vget_high_u32 vget_high_s32 - -uint64x1_t vget_high_u64(uint64x2_t a); // VMOV d0,d0 -#define vget_high_u64 vget_high_s64 - -poly8x8_t vget_high_p8(poly8x16_t a); // VMOV d0,d0 -#define vget_high_p8 vget_high_u8 - -poly16x4_t vget_high_p16(poly16x8_t a); // VMOV d0,d0 -#define vget_high_p16 vget_high_u16 - -//********************** Get low part ********************** -//********************************************************** -int8x8_t vget_low_s8(int8x16_t a); // VMOV d0,d0 -_NEON2SSE_INLINE int8x8_t vget_low_s8(int8x16_t a) // VMOV d0,d0 -{ - int16x4_t 
res64; - return64(a); -} - -int16x4_t vget_low_s16(int16x8_t a); // VMOV d0,d0 -_NEON2SSE_INLINE int16x4_t vget_low_s16(int16x8_t a) // VMOV d0,d0 -{ - int16x4_t res64; - return64(a); -} - -int32x2_t vget_low_s32(int32x4_t a); // VMOV d0,d0 -_NEON2SSE_INLINE int32x2_t vget_low_s32(int32x4_t a) // VMOV d0,d0 -{ - int32x2_t res64; - return64(a); -} - -int64x1_t vget_low_s64(int64x2_t a); // VMOV d0,d0 -_NEON2SSE_INLINE int64x1_t vget_low_s64(int64x2_t a) // VMOV d0,d0 -{ - int64x1_t res64; - return64 (a); -} - -float16x4_t vget_low_f16(float16x8_t a); // VMOV d0,d0 -// IA32 SIMD doesn't work with 16bit floats currently - -float32x2_t vget_low_f32(float32x4_t a); // VMOV d0,d0 -_NEON2SSE_INLINE float32x2_t vget_low_f32(float32x4_t a) -{ - float32x2_t res64; - _M64f(res64, a); - return res64; -} - -uint8x8_t vget_low_u8(uint8x16_t a); // VMOV d0,d0 -#define vget_low_u8 vget_low_s8 - -uint16x4_t vget_low_u16(uint16x8_t a); // VMOV d0,d0 -#define vget_low_u16 vget_low_s16 - -uint32x2_t vget_low_u32(uint32x4_t a); // VMOV d0,d0 -#define vget_low_u32 vget_low_s32 - -uint64x1_t vget_low_u64(uint64x2_t a); // VMOV d0,d0 -#define vget_low_u64 vget_low_s64 - -poly8x8_t vget_low_p8(poly8x16_t a); // VMOV d0,d0 -#define vget_low_p8 vget_low_u8 - -poly16x4_t vget_low_p16(poly16x8_t a); // VMOV d0,d0 -#define vget_low_p16 vget_low_s16 - -//************************************************************************** -//************************ Converting vectors ********************************** -//************************************************************************** -//************* Convert from float *************************************** -// need to set _MM_SET_ROUNDING_MODE ( x) accordingly -int32x2_t vcvt_s32_f32(float32x2_t a); // VCVT.S32.F32 d0, d0 -_NEON2SSE_INLINE int32x2_t vcvt_s32_f32(float32x2_t a) -{ - int32x2_t res64; - __m128i res; - res = _mm_cvttps_epi32(_pM128(a)); //use low 64 bits of result only - return64(res); -} - -uint32x2_t vcvt_u32_f32(float32x2_t 
a); // VCVT.U32.F32 d0, d0 -_NEON2SSE_INLINE uint32x2_t vcvt_u32_f32(float32x2_t a) -{ - //may be less efficient than a serial solution - uint32x2_t res64; - __m128i res; - res = vcvtq_u32_f32(_pM128(a)); - return64(res); -} - -int32x4_t vcvtq_s32_f32(float32x4_t a); // VCVT.S32.F32 q0, q0 -#define vcvtq_s32_f32 _mm_cvttps_epi32 - -uint32x4_t vcvtq_u32_f32(float32x4_t a); // VCVT.U32.F32 q0, q0 -_NEON2SSE_INLINE uint32x4_t vcvtq_u32_f32(float32x4_t a) // VCVT.U32.F32 q0, q0 -{ - //No single-instruction SSE solution, but we could implement it as follows: - __m128i resi; - __m128 zero, mask, a_pos, mask_f_max_si, res; - _NEON2SSE_ALIGN_16 int32_t c7fffffff[4] = {0x7fffffff, 0x7fffffff, 0x7fffffff, 0x7fffffff}; - zero = _mm_setzero_ps(); - mask = _mm_cmpgt_ps(a, zero); - a_pos = _mm_and_ps(a, mask); - mask_f_max_si = _mm_cmpgt_ps(a_pos,*(__m128*)c7fffffff); - res = _mm_sub_ps(a_pos, mask_f_max_si); //if the input fits in the signed range we don't subtract anything - resi = _mm_cvttps_epi32(res); - return _mm_add_epi32(resi, *(__m128i*)&mask_f_max_si); -} - -// ***** Convert to the fixed point with the number of fraction bits specified by b *********** -//************************************************************************************************* -int32x2_t vcvt_n_s32_f32(float32x2_t a, __constrange(1,32) int b); // VCVT.S32.F32 d0, d0, #32 -_NEON2SSE_INLINE int32x2_t vcvt_n_s32_f32(float32x2_t a, __constrange(1,32) int b) -{ - int32x2_t res64; - return64(vcvtq_n_s32_f32(_pM128(a),b)); -} - -uint32x2_t vcvt_n_u32_f32(float32x2_t a, __constrange(1,32) int b); // VCVT.U32.F32 d0, d0, #32 -_NEON2SSE_INLINE uint32x2_t vcvt_n_u32_f32(float32x2_t a, __constrange(1,32) int b) -{ - uint32x2_t res; - float convconst; - convconst = (float)((uint32_t)1 << b); - res.m64_u32[0] = (uint32_t) (a.m64_f32[0] * convconst); - res.m64_u32[1] = (uint32_t) (a.m64_f32[1] * convconst); - return res; -} - -int32x4_t vcvtq_n_s32_f32(float32x4_t a, __constrange(1,32) int b); //
VCVT.S32.F32 q0, q0, #32 -_NEON2SSE_INLINE int32x4_t vcvtq_n_s32_f32(float32x4_t a, __constrange(1,32) int b) -{ - float convconst; - _NEON2SSE_ALIGN_16 uint32_t cmask[] = {0x80000000, 0x80000000, 0x80000000, 0x80000000}; - __m128 cconst128; - __m128i mask, res; - convconst = (float)(1 << b); - cconst128 = vdupq_n_f32(convconst); - res = _mm_cvttps_epi32(_mm_mul_ps(a,cconst128)); - mask = _mm_cmpeq_epi32 (res, *(__m128i*)cmask); - return _mm_xor_si128 (res, mask); //res saturated for 0x80000000 -} - -uint32x4_t vcvtq_n_u32_f32(float32x4_t a, __constrange(1,32) int b); // VCVT.U32.F32 q0, q0, #32 -_NEON2SSE_INLINE uint32x4_t vcvtq_n_u32_f32(float32x4_t a, __constrange(1,32) int b) -{ - float convconst; - __m128 cconst128; - convconst = (float)(1 << b); - cconst128 = vdupq_n_f32(convconst); - return vcvtq_u32_f32(_mm_mul_ps(a,cconst128)); -} - -//***************** Convert to float ************************* -//************************************************************* -float32x2_t vcvt_f32_s32(int32x2_t a); // VCVT.F32.S32 d0, d0 -_NEON2SSE_INLINE float32x2_t vcvt_f32_s32(int32x2_t a) //use low 64 bits -{ - float32x2_t res; - res.m64_f32[0] = (float) a.m64_i32[0]; - res.m64_f32[1] = (float) a.m64_i32[1]; - return res; -} - -float32x2_t vcvt_f32_u32(uint32x2_t a); // VCVT.F32.U32 d0, d0 -_NEON2SSE_INLINE float32x2_t vcvt_f32_u32(uint32x2_t a) -{ - float32x2_t res; - res.m64_f32[0] = (float) a.m64_u32[0]; - res.m64_f32[1] = (float) a.m64_u32[1]; - return res; -} - -float32x4_t vcvtq_f32_s32(int32x4_t a); // VCVT.F32.S32 q0, q0 -#define vcvtq_f32_s32(a) _mm_cvtepi32_ps(a) - -float32x4_t vcvtq_f32_u32(uint32x4_t a); // VCVT.F32.U32 q0, q0 -_NEON2SSE_INLINE float32x4_t vcvtq_f32_u32(uint32x4_t a) // VCVT.F32.U32 q0, q0 -{ - //solution may be not optimal - __m128 two16, fHi, fLo; - __m128i hi, lo; - two16 = _mm_set1_ps((float)0x10000); //2^16 - // Avoid double rounding by doing two exact conversions - // of high and low 16-bit segments - hi = _mm_srli_epi32(a, 16); - lo 
= _mm_srli_epi32(_mm_slli_epi32(a, 16), 16); - fHi = _mm_mul_ps(_mm_cvtepi32_ps(hi), two16); - fLo = _mm_cvtepi32_ps(lo); - // do single rounding according to current rounding mode - return _mm_add_ps(fHi, fLo); -} - -// ***** Convert to the float from fixed point with the number of fraction bits specified by b *********** -float32x2_t vcvt_n_f32_s32(int32x2_t a, __constrange(1,32) int b); // VCVT.F32.S32 d0, d0, #32 -_NEON2SSE_INLINE float32x2_t vcvt_n_f32_s32(int32x2_t a, __constrange(1,32) int b) -{ - float32x2_t res; - float convconst; - convconst = (float)(1. / ((uint32_t)1 << b)); - res.m64_f32[0] = a.m64_i32[0] * convconst; - res.m64_f32[1] = a.m64_i32[1] * convconst; - return res; -} - -float32x2_t vcvt_n_f32_u32(uint32x2_t a, __constrange(1,32) int b); // VCVT.F32.U32 d0, d0, #32 -_NEON2SSE_INLINE float32x2_t vcvt_n_f32_u32(uint32x2_t a, __constrange(1,32) int b) // VCVT.F32.U32 d0, d0, #32 -{ - float32x2_t res; - float convconst; - convconst = (float)(1. / ((uint32_t)1 << b)); - res.m64_f32[0] = a.m64_u32[0] * convconst; - res.m64_f32[1] = a.m64_u32[1] * convconst; - return res; -} - -float32x4_t vcvtq_n_f32_s32(int32x4_t a, __constrange(1,32) int b); // VCVT.F32.S32 q0, q0, #32 -_NEON2SSE_INLINE float32x4_t vcvtq_n_f32_s32(int32x4_t a, __constrange(1,32) int b) -{ - float convconst; - __m128 cconst128, af; - convconst = (float)(1. / ((uint32_t)1 << b)); - af = _mm_cvtepi32_ps(a); - cconst128 = vdupq_n_f32(convconst); - return _mm_mul_ps(af,cconst128); -} - -float32x4_t vcvtq_n_f32_u32(uint32x4_t a, __constrange(1,32) int b); // VCVT.F32.U32 q0, q0, #32 -_NEON2SSE_INLINE float32x4_t vcvtq_n_f32_u32(uint32x4_t a, __constrange(1,32) int b) -{ - float convconst; - __m128 cconst128, af; - convconst = (float)(1. 
/ (1 << b)); - af = vcvtq_f32_u32(a); - cconst128 = vdupq_n_f32(convconst); - return _mm_mul_ps(af,cconst128); -} - -//**************Convert between floats *********************** -//************************************************************ -float16x4_t vcvt_f16_f32(float32x4_t a); // VCVT.F16.F32 d0, q0 -//Intel SIMD doesn't currently support 16-bit floats - -float32x4_t vcvt_f32_f16(float16x4_t a); // VCVT.F32.F16 q0, d0 -//Intel SIMD doesn't currently support 16-bit floats; the only solution is to store 16-bit floats and load them as 32 bits - -//************Vector narrow integer conversion (truncation) ****************** -//**************************************************************************** -int8x8_t vmovn_s16(int16x8_t a); // VMOVN.I16 d0,q0 -_NEON2SSE_INLINE int8x8_t vmovn_s16(int16x8_t a) // VMOVN.I16 d0,q0 -{ - int8x8_t res64; - __m128i res; - _NEON2SSE_ALIGN_16 int8_t mask8_16_even_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15 }; - res = _mm_shuffle_epi8 (a, *(__m128i*) mask8_16_even_odd); //use 64 low bits only - return64(res); -} - -int16x4_t vmovn_s32(int32x4_t a); // VMOVN.I32 d0,q0 -_NEON2SSE_INLINE int16x4_t vmovn_s32(int32x4_t a) // VMOVN.I32 d0,q0 -{ - int16x4_t res64; - __m128i res; - _NEON2SSE_ALIGN_16 int8_t mask8_32_even_odd[16] = { 0,1, 4,5, 8,9, 12,13, 2,3, 6,7,10,11,14,15}; - res = _mm_shuffle_epi8 (a, *(__m128i*) mask8_32_even_odd); //use 64 low bits only - return64(res); -} - -int32x2_t vmovn_s64(int64x2_t a); // VMOVN.I64 d0,q0 -_NEON2SSE_INLINE int32x2_t vmovn_s64(int64x2_t a) -{ - //may be less efficient than a serial implementation - int32x2_t res64; - __m128i res; - res = _mm_shuffle_epi32 (a, 0 | (2 << 2) | (1 << 4) | (3 << 6)); //use 64 low bits only, _MM_SHUFFLE(3, 1, 2, 0) - return64(res); -} - -uint8x8_t vmovn_u16(uint16x8_t a); // VMOVN.I16 d0,q0 -#define vmovn_u16 vmovn_s16 - -uint16x4_t vmovn_u32(uint32x4_t a); // VMOVN.I32 d0,q0 -#define vmovn_u32 vmovn_s32 - -uint32x2_t vmovn_u64(uint64x2_t a);
// VMOVN.I64 d0,q0 -#define vmovn_u64 vmovn_s64 - -//**************** Vector long move *********************** -//*********************************************************** -int16x8_t vmovl_s8(int8x8_t a); // VMOVL.S8 q0,d0 -#define vmovl_s8(a) _MM_CVTEPI8_EPI16(_pM128i(a)) //SSE4.1 - -int32x4_t vmovl_s16(int16x4_t a); // VMOVL.S16 q0,d0 -#define vmovl_s16(a) _MM_CVTEPI16_EPI32(_pM128i(a)) //SSE4.1 - -int64x2_t vmovl_s32(int32x2_t a); // VMOVL.S32 q0,d0 -#define vmovl_s32(a) _MM_CVTEPI32_EPI64(_pM128i(a)) //SSE4.1 - -uint16x8_t vmovl_u8(uint8x8_t a); // VMOVL.U8 q0,d0 -#define vmovl_u8(a) _MM_CVTEPU8_EPI16(_pM128i(a)) //SSE4.1 - -uint32x4_t vmovl_u16(uint16x4_t a); // VMOVL.s16 q0,d0 -#define vmovl_u16(a) _MM_CVTEPU16_EPI32(_pM128i(a)) //SSE4.1 - -uint64x2_t vmovl_u32(uint32x2_t a); // VMOVL.U32 q0,d0 -#define vmovl_u32(a) _MM_CVTEPU32_EPI64(_pM128i(a)) //SSE4.1 - -//*************Vector saturating narrow integer***************** -//************************************************************** -int8x8_t vqmovn_s16(int16x8_t a); // VQMOVN.S16 d0,q0 -_NEON2SSE_INLINE int8x8_t vqmovn_s16(int16x8_t a) -{ - int8x8_t res64; - __m128i res; - res = _mm_packs_epi16(a, a); - return64(res); -} - -int16x4_t vqmovn_s32(int32x4_t a); // VQMOVN.S32 d0,q0 -_NEON2SSE_INLINE int16x4_t vqmovn_s32(int32x4_t a) -{ - int16x4_t res64; - __m128i res; - res = _mm_packs_epi32(a, a); - return64(res); -} - -int32x2_t vqmovn_s64(int64x2_t a); // VQMOVN.S64 d0,q0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x2_t vqmovn_s64(int64x2_t a),_NEON2SSE_REASON_SLOW_SERIAL) //no effective SIMD solution -{ - int32x2_t res; - _NEON2SSE_ALIGN_16 int64_t atmp[2]; - _mm_store_si128((__m128i*)atmp, a); - if(atmp[0]>SINT_MAX) atmp[0] = SINT_MAX; - if(atmp[0]<SINT_MIN) atmp[0] = SINT_MIN; - if(atmp[1]>SINT_MAX) atmp[1] = SINT_MAX; - if(atmp[1]<SINT_MIN) atmp[1] = SINT_MIN; - res.m64_i32[0] = (int32_t)atmp[0]; - res.m64_i32[1] = (int32_t)atmp[1]; - return res; -} - -uint8x8_t vqmovn_u16(uint16x8_t a); 
// VQMOVN.s16 d0,q0 -_NEON2SSE_INLINE uint8x8_t vqmovn_u16(uint16x8_t a) // VQMOVN.s16 d0,q0 -{ - //no uint16 to uint8 conversion in SSE, need to truncate to max signed first - uint8x8_t res64; - __m128i c7fff, a_trunc; - c7fff = _mm_set1_epi16 (0x7fff); // 15th bit set to zero - a_trunc = _mm_and_si128(a, c7fff); // a truncated to max signed - a_trunc = _mm_packus_epi16 (a_trunc, a_trunc); //use low 64bits only - return64(a_trunc); -} - -uint16x4_t vqmovn_u32(uint32x4_t a); // VQMOVN.U32 d0,q0 -_NEON2SSE_INLINE uint16x4_t vqmovn_u32(uint32x4_t a) // VQMOVN.U32 d0,q0 -{ - //no uint32 to uint16 conversion in SSE, need to truncate to max signed first - uint16x4_t res64; - __m128i c7fffffff, a_trunc; - c7fffffff = _mm_set1_epi32((uint32_t)0x7fffffff); // 31st bit set to zero - a_trunc = _mm_and_si128(a, c7fffffff); // a truncated to max signed - a_trunc = _MM_PACKUS1_EPI32 (a_trunc); //use low 64bits only - return64(a_trunc); -} - -uint32x2_t vqmovn_u64(uint64x2_t a); // VQMOVN.U64 d0,q0 -_NEON2SSE_INLINE uint32x2_t vqmovn_u64(uint64x2_t a) -{ - //serial solution may be faster - uint32x2_t res64; - __m128i res_hi, mask; - mask = _mm_setzero_si128(); - res_hi = _mm_srli_epi64(a, 32); - res_hi = _mm_cmpeq_epi32(res_hi, mask); - mask = _mm_cmpeq_epi32(mask,mask); //all fff - mask = _mm_andnot_si128(res_hi,mask); //invert res_hi to flag the numbers wider than 32 bits - res_hi = _mm_or_si128(a, mask); - res_hi = _mm_shuffle_epi32(res_hi, 0 | (2 << 2) | (1 << 4) | (3 << 6)); //shuffle the data to get two 32-bit values - return64(res_hi); -} -//************* Vector saturating narrow integer signed->unsigned ************** -//***************************************************************************** -uint8x8_t vqmovun_s16(int16x8_t a); // VQMOVUN.S16 d0,q0 -_NEON2SSE_INLINE uint8x8_t vqmovun_s16(int16x8_t a) -{ - uint8x8_t res64; - __m128i res; - res = _mm_packus_epi16(a, a); //use low 64bits only - return64(res); -} - -uint16x4_t vqmovun_s32(int32x4_t a); // VQMOVUN.S32 d0,q0 -_NEON2SSE_INLINE
uint16x4_t vqmovun_s32(int32x4_t a)
-{
-    uint16x4_t res64;
-    __m128i res;
-    res = _MM_PACKUS1_EPI32(a); //use low 64bits only
-    return64(res);
-}
-
-uint32x2_t vqmovun_s64(int64x2_t a); // VQMOVUN.S64 d0,q0
-_NEON2SSE_INLINE uint32x2_t vqmovun_s64(int64x2_t a)
-{
-    uint32x2_t res64;
-    __m128i res_hi,res_lo, zero, cmp;
-    zero = _mm_setzero_si128();
-    res_hi = _mm_srli_epi64(a, 32);
-    cmp = _mm_cmpgt_epi32(zero, res_hi); //if the high half is negative the result should be zero
-    res_lo = _mm_andnot_si128(cmp,a); //if cmp is zero - do nothing, otherwise the high half is negative and the result is 0
-    cmp = _mm_cmpgt_epi32(res_hi,zero); //if cmp is positive
-    res_lo = _mm_or_si128(res_lo, cmp); //if cmp is positive we are out of 32 bits and need to saturate to 0xffffffff
-    res_lo = _mm_shuffle_epi32(res_lo, 0 | (2 << 2) | (1 << 4) | (3 << 6)); //shuffle the data to get 2 32-bits
-    return64(res_lo);
-}
-
-// ********************************************************
-// **************** Table look up **************************
-// ********************************************************
-//VTBL (Vector Table Lookup) uses byte indexes in a control vector to look up byte values
-//in a table and generate a new vector. Indexes out of range return 0.
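The VTBL semantics described above can be checked against a plain scalar reference model. This is an illustrative sketch only; the helper name `vtbl_ref` is ours and is not part of the header:

```c
#include <stdint.h>

/* Scalar reference model of VTBL: each output byte is table[idx[i]]
   when idx[i] is a valid index into the table, and 0 otherwise.
   table_len is 8, 16, 24 or 32 for the vtbl1..vtbl4 variants. */
static void vtbl_ref(uint8_t dst[8], const uint8_t *table, int table_len,
                     const uint8_t idx[8])
{
    for (int i = 0; i < 8; i++)
        dst[i] = (idx[i] < table_len) ? table[idx[i]] : 0;
}
```

The SSE emulation gets the same zeroing for free because `_mm_shuffle_epi8` zeroes any output byte whose control byte has its most significant bit set; ORing the "index out of range" compare mask into the index vector sets exactly that bit for out-of-range lanes.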
-//for Intel SIMD we need to set the MSB to 1 for zero return
-uint8x8_t vtbl1_u8(uint8x8_t a, uint8x8_t b); // VTBL.8 d0, {d0}, d0
-_NEON2SSE_INLINE uint8x8_t vtbl1_u8(uint8x8_t a, uint8x8_t b)
-{
-    uint8x8_t res64;
-    __m128i c7, maskgt, bmask, b128;
-    c7 = _mm_set1_epi8 (7);
-    b128 = _pM128i(b);
-    maskgt = _mm_cmpgt_epi8(b128,c7);
-    bmask = _mm_or_si128(b128,maskgt);
-    bmask = _mm_shuffle_epi8(_pM128i(a),bmask);
-    return64(bmask);
-}
-
-int8x8_t vtbl1_s8(int8x8_t a, int8x8_t b); // VTBL.8 d0, {d0}, d0
-#define vtbl1_s8 vtbl1_u8
-
-poly8x8_t vtbl1_p8(poly8x8_t a, uint8x8_t b); // VTBL.8 d0, {d0}, d0
-#define vtbl1_p8 vtbl1_u8
-
-//Special trick to avoid the "__declspec(align('8')) won't be aligned" error
-//uint8x8_t vtbl2_u8(uint8x8x2_t a, uint8x8_t b); // VTBL.8 d0, {d0, d1}, d0
-uint8x8_t vtbl2_u8_ptr(uint8x8x2_t* a, uint8x8_t b); // VTBL.8 d0, {d0, d1}, d0
-_NEON2SSE_INLINE uint8x8_t vtbl2_u8_ptr(uint8x8x2_t* a, uint8x8_t b)
-{
-    uint8x8_t res64;
-    __m128i c15, a01, maskgt15, bmask, b128;
-    c15 = _mm_set1_epi8 (15);
-    b128 = _pM128i(b);
-    maskgt15 = _mm_cmpgt_epi8(b128,c15);
-    bmask = _mm_or_si128(b128, maskgt15);
-    a01 = _mm_unpacklo_epi64(_pM128i(a->val[0]), _pM128i(a->val[1]));
-    a01 = _mm_shuffle_epi8(a01, bmask);
-    return64(a01);
-}
-#define vtbl2_u8(a, b) vtbl2_u8_ptr(&a, b)
-
-//int8x8_t vtbl2_s8(int8x8x2_t a, int8x8_t b); // VTBL.8 d0, {d0, d1}, d0
-#define vtbl2_s8 vtbl2_u8
-
-//poly8x8_t vtbl2_p8(poly8x8x2_t a, uint8x8_t b); // VTBL.8 d0, {d0, d1}, d0
-#define vtbl2_p8 vtbl2_u8
-
-//Special trick to avoid the "__declspec(align('16')) won't be aligned" error
-//uint8x8_t vtbl3_u8(uint8x8x3_t a, uint8x8_t b); // VTBL.8 d0, {d0, d1, d2}, d0
-_NEON2SSE_INLINE uint8x8_t vtbl3_u8_ptr(uint8x8x3_t* a, uint8x8_t b)
-{
-    //the solution may not be optimal
-    uint8x8_t res64;
-    __m128i c15, c23, maskgt23, bmask, maskgt15, sh0, sh1, a01, b128;
-    c15 = _mm_set1_epi8 (15);
-    c23 = _mm_set1_epi8 (23);
-    b128 = _pM128i(b);
-    maskgt23 = _mm_cmpgt_epi8(b128,c23);
-    bmask =
_mm_or_si128(b128, maskgt23);
-    maskgt15 = _mm_cmpgt_epi8(b128,c15);
-    a01 = _mm_unpacklo_epi64(_pM128i(a->val[0]),_pM128i(a->val[1]));
-    sh0 = _mm_shuffle_epi8(a01, bmask);
-    sh1 = _mm_shuffle_epi8(_pM128i(a->val[2]), bmask); //for bi>15 bi is wrapped (bi-=16)
-    sh0 = _MM_BLENDV_EPI8(sh0, sh1, maskgt15); //SSE4.1
-    return64(sh0);
-}
-#define vtbl3_u8(a,b) vtbl3_u8_ptr(&a,b)
-
-//int8x8_t vtbl3_s8(int8x8x3_t a, int8x8_t b); // VTBL.8 d0, {d0, d1, d2}, d0
-int8x8_t vtbl3_s8_ptr(int8x8x3_t* a, int8x8_t b); // VTBL.8 d0, {d0, d1, d2}, d0
-#define vtbl3_s8 vtbl3_u8
-
-//poly8x8_t vtbl3_p8(poly8x8x3_t a, uint8x8_t b); // VTBL.8 d0, {d0, d1, d2}, d0
-poly8x8_t vtbl3_p8_ptr(poly8x8x3_t* a, uint8x8_t b); // VTBL.8 d0, {d0, d1, d2}, d0
-#define vtbl3_p8 vtbl3_u8
-
-//uint8x8_t vtbl4_u8(uint8x8x4_t a, uint8x8_t b); // VTBL.8 d0, {d0, d1, d2, d3}, d0
-_NEON2SSE_INLINE uint8x8_t vtbl4_u8_ptr(uint8x8x4_t* a, uint8x8_t b)
-{
-    //the solution may not be optimal
-    uint8x8_t res64;
-    __m128i c15, c31, maskgt31, bmask, maskgt15, sh0, sh1, a01, a23, b128;
-    c15 = _mm_set1_epi8 (15);
-    c31 = _mm_set1_epi8 (31);
-    b128 = _pM128i(b);
-    maskgt31 = _mm_cmpgt_epi8(b128,c31);
-    bmask = _mm_or_si128(b128, maskgt31);
-    maskgt15 = _mm_cmpgt_epi8(b128,c15);
-    a01 = _mm_unpacklo_epi64(_pM128i(a->val[0]),_pM128i(a->val[1]));
-    a23 = _mm_unpacklo_epi64(_pM128i(a->val[2]),_pM128i(a->val[3]));
-    sh0 = _mm_shuffle_epi8(a01, bmask);
-    sh1 = _mm_shuffle_epi8(a23, bmask); //for bi>15 bi is wrapped (bi-=16)
-    sh0 = _MM_BLENDV_EPI8 (sh0, sh1, maskgt15); //SSE4.1
-    return64(sh0);
-}
-#define vtbl4_u8(a,b) vtbl4_u8_ptr(&a,b)
-
-//int8x8_t vtbl4_s8(int8x8x4_t a, int8x8_t b); // VTBL.8 d0, {d0, d1, d2, d3}, d0
-int8x8_t vtbl4_s8_ptr(int8x8x4_t* a, int8x8_t b); // VTBL.8 d0, {d0, d1, d2, d3}, d0
-#define vtbl4_s8 vtbl4_u8
-
-//poly8x8_t vtbl4_p8(poly8x8x4_t a, uint8x8_t b); // VTBL.8 d0, {d0, d1, d2, d3}, d0
-poly8x8_t vtbl4_p8_ptr(poly8x8x4_t* a, uint8x8_t b); // VTBL.8 d0, {d0, d1, d2, d3}, d0
-#define vtbl4_p8
vtbl4_u8
-
-//****************** Extended table look up intrinsics ***************************
-//**********************************************************************************
-//VTBX (Vector Table Extension) works in the same way as VTBL does,
-// except that indexes out of range leave the destination element unchanged.
-
-uint8x8_t vtbx1_u8(uint8x8_t a, uint8x8_t b, uint8x8_t c); // VTBX.8 d0, {d0}, d0
-_NEON2SSE_INLINE uint8x8_t vtbx1_u8(uint8x8_t a, uint8x8_t b, uint8x8_t c)
-{
-    uint8x8_t res64;
-    __m128i c7, maskgt, sh, c128;
-    c7 = _mm_set1_epi8 (7);
-    c128 = _pM128i(c);
-    maskgt = _mm_cmpgt_epi8(c128,c7);
-    c7 = _mm_and_si128(maskgt,_pM128i(a));
-    sh = _mm_shuffle_epi8(_pM128i(b),c128);
-    sh = _mm_andnot_si128(maskgt,sh);
-    sh = _mm_or_si128(sh,c7);
-    return64(sh);
-}
-
-int8x8_t vtbx1_s8(int8x8_t a, int8x8_t b, int8x8_t c); // VTBX.8 d0, {d0}, d0
-#define vtbx1_s8 vtbx1_u8
-
-poly8x8_t vtbx1_p8(poly8x8_t a, poly8x8_t b, uint8x8_t c); // VTBX.8 d0, {d0}, d0
-#define vtbx1_p8 vtbx1_u8
-
-//Special trick to avoid the "__declspec(align('8')) won't be aligned" error
-//uint8x8_t vtbx2_u8(uint8x8_t a, uint8x8x2_t b, uint8x8_t c); // VTBX.8 d0, {d0, d1}, d0
-uint8x8_t vtbx2_u8_ptr(uint8x8_t a, uint8x8x2_t* b, uint8x8_t c); // VTBX.8 d0, {d0, d1}, d0
-_NEON2SSE_INLINE uint8x8_t vtbx2_u8_ptr(uint8x8_t a, uint8x8x2_t* b, uint8x8_t c)
-{
-    uint8x8_t res64;
-    __m128i c15, b01, maskgt15, sh, c128;
-    c15 = _mm_set1_epi8 (15);
-    c128 = _pM128i(c);
-    maskgt15 = _mm_cmpgt_epi8(c128, c15);
-    c15 = _mm_and_si128(maskgt15, _pM128i(a));
-    b01 = _mm_unpacklo_epi64(_pM128i(b->val[0]), _pM128i(b->val[1]));
-    sh = _mm_shuffle_epi8(b01, c128);
-    sh = _mm_andnot_si128(maskgt15, sh);
-    sh = _mm_or_si128(sh,c15);
-    return64(sh);
-}
-#define vtbx2_u8(a, b, c) vtbx2_u8_ptr(a, &b, c)
-
-//int8x8_t vtbx2_s8(int8x8_t a, int8x8x2_t b, int8x8_t c); // VTBX.8 d0, {d0, d1}, d0
-#define vtbx2_s8 vtbx2_u8
-
-//poly8x8_t vtbx2_p8(poly8x8_t a, poly8x8x2_t b, uint8x8_t c); // VTBX.8 d0, {d0, d1}, d0
-#define vtbx2_p8 vtbx2_u8
-
-//uint8x8_t vtbx3_u8(uint8x8_t a, uint8x8x3_t b, uint8x8_t c) // VTBX.8 d0, {d0, d1, d2}, d0
-_NEON2SSE_INLINE uint8x8_t vtbx3_u8_ptr(uint8x8_t a, uint8x8x3_t* b, uint8x8_t c)
-{
-    //the solution may not be optimal
-    uint8x8_t res64;
-    __m128i c15, c23, maskgt15, maskgt23, sh0, sh1, b01, c128;
-    c15 = _mm_set1_epi8 (15);
-    c23 = _mm_set1_epi8 (23);
-    c128 = _pM128i(c);
-    maskgt15 = _mm_cmpgt_epi8(c128,c15);
-    maskgt23 = _mm_cmpgt_epi8(c128,c23);
-    c23 = _mm_and_si128(maskgt23, _pM128i(a));
-    b01 = _mm_unpacklo_epi64(_pM128i(b->val[0]),_pM128i(b->val[1]));
-    sh0 = _mm_shuffle_epi8(b01, c128);
-    sh1 = _mm_shuffle_epi8(_pM128i(b->val[2]), c128); //for bi>15 bi is wrapped (bi-=16)
-    sh0 = _MM_BLENDV_EPI8(sh0, sh1, maskgt15);
-    sh0 = _mm_andnot_si128(maskgt23,sh0);
-    sh0 = _mm_or_si128(sh0,c23);
-    return64(sh0);
-}
-#define vtbx3_u8(a, b, c) vtbx3_u8_ptr(a, &b, c)
-
-//int8x8_t vtbx3_s8(int8x8_t a, int8x8x3_t b, int8x8_t c); // VTBX.8 d0, {d0, d1, d2}, d0
-int8x8_t vtbx3_s8_ptr(int8x8_t a, int8x8x3_t* b, int8x8_t c);
-#define vtbx3_s8 vtbx3_u8
-
-//poly8x8_t vtbx3_p8(poly8x8_t a, poly8x8x3_t b, uint8x8_t c); // VTBX.8 d0, {d0, d1, d2}, d0
-poly8x8_t vtbx3_p8_ptr(poly8x8_t a, poly8x8x3_t* b, uint8x8_t c);
-#define vtbx3_p8 vtbx3_u8
-
-//uint8x8_t vtbx4_u8(uint8x8_t a, uint8x8x4_t b, uint8x8_t c) // VTBX.8 d0, {d0, d1, d2, d3}, d0
-_NEON2SSE_INLINE uint8x8_t vtbx4_u8_ptr(uint8x8_t a, uint8x8x4_t* b, uint8x8_t c)
-{
-    //the solution may not be optimal
-    uint8x8_t res64;
-    __m128i c15, c31, maskgt15, maskgt31, sh0, sh1, b01, b23, c128;
-    c15 = _mm_set1_epi8 (15);
-    c31 = _mm_set1_epi8 (31);
-    c128 = _pM128i(c);
-    maskgt15 = _mm_cmpgt_epi8(c128,c15);
-    maskgt31 = _mm_cmpgt_epi8(c128,c31);
-    c31 = _mm_and_si128(maskgt31, _pM128i(a));
-
-    b01 = _mm_unpacklo_epi64(_pM128i(b->val[0]),_pM128i(b->val[1]));
-    b23 = _mm_unpacklo_epi64(_pM128i(b->val[2]),_pM128i(b->val[3]));
-    sh0 = _mm_shuffle_epi8(b01, c128);
-    sh1 = _mm_shuffle_epi8(b23, c128); //for bi>15 bi is wrapped (bi-=16)
-    sh0 = _MM_BLENDV_EPI8(sh0, sh1, maskgt15);
-    sh0 = _mm_andnot_si128(maskgt31,sh0);
-    sh0 = _mm_or_si128(sh0,c31);
-    return64(sh0);
-}
-#define vtbx4_u8(a, b, c) vtbx4_u8_ptr(a, &b, c)
-
-//int8x8_t vtbx4_s8(int8x8_t a, int8x8x4_t b, int8x8_t c); // VTBX.8 d0, {d0, d1, d2, d3}, d0
-int8x8_t vtbx4_s8_ptr(int8x8_t a, int8x8x4_t* b, int8x8_t c);
-#define vtbx4_s8 vtbx4_u8
-
-//poly8x8_t vtbx4_p8(poly8x8_t a, poly8x8x4_t b, uint8x8_t c); // VTBX.8 d0, {d0, d1, d2, d3}, d0
-poly8x8_t vtbx4_p8_ptr(poly8x8_t a, poly8x8x4_t* b, uint8x8_t c);
-#define vtbx4_p8 vtbx4_u8
-
-//*************************************************************************************************
-// *************************** Operations with a scalar value *********************************
-//*************************************************************************************************
-
-//******* Vector multiply accumulate by scalar *************************************************
-//**********************************************************************************************
-int16x4_t vmla_lane_s16(int16x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l); // VMLA.I16 d0, d0, d0[0]
-_NEON2SSE_INLINE int16x4_t vmla_lane_s16(int16x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l) // VMLA.I16 d0, d0, d0[0]
-{
-    int16_t c;
-    int16x4_t scalar;
-    c = vget_lane_s16(v, l);
-    scalar = vdup_n_s16(c);
-    return vmla_s16(a, b, scalar);
-}
-
-int32x2_t vmla_lane_s32(int32x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l); // VMLA.I32 d0, d0, d0[0]
-_NEON2SSE_INLINE int32x2_t vmla_lane_s32(int32x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l) // VMLA.I32 d0, d0, d0[0]
-{
-    int32_t c;
-    int32x2_t scalar;
-    c = vget_lane_s32(v, l);
-    scalar = vdup_n_s32(c);
-    return vmla_s32(a, b, scalar);
-}
-
-uint16x4_t vmla_lane_u16(uint16x4_t a, uint16x4_t b, uint16x4_t v, __constrange(0,3) int l); // VMLA.I16 d0, d0, d0[0]
-#define vmla_lane_u16 vmla_lane_s16
-
-
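All of the `vmla_lane` variants above follow the same pattern: extract the chosen lane, broadcast it, then call the plain multiply-accumulate. As a scalar sketch for illustration (the helper name `vmla_lane_s16_ref` is ours, not a header function):

```c
#include <stdint.h>

/* Scalar model of vmla_lane_s16: d[i] = a[i] + b[i] * v[lane],
   where integer results wrap modulo 2^16 as on NEON. */
static void vmla_lane_s16_ref(int16_t d[4], const int16_t a[4],
                              const int16_t b[4], const int16_t v[4], int lane)
{
    for (int i = 0; i < 4; i++)
        d[i] = (int16_t)(a[i] + b[i] * v[lane]);
}
```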
-uint32x2_t vmla_lane_u32(uint32x2_t a, uint32x2_t b, uint32x2_t v, __constrange(0,1) int l); // VMLA.I32 d0, d0, d0[0] -#define vmla_lane_u32 vmla_lane_s32 - -float32x2_t vmla_lane_f32(float32x2_t a, float32x2_t b, float32x2_t v, __constrange(0,1) int l); // VMLA.F32 d0, d0, d0[0] -_NEON2SSE_INLINE float32x2_t vmla_lane_f32(float32x2_t a, float32x2_t b, float32x2_t v, __constrange(0,1) int l) -{ - float32_t vlane; - float32x2_t c; - vlane = vget_lane_f32(v, l); - c = vdup_n_f32(vlane); - return vmla_f32(a,b,c); -} - -int16x8_t vmlaq_lane_s16(int16x8_t a, int16x8_t b, int16x4_t v, __constrange(0,3) int l); // VMLA.I16 q0, q0, d0[0] -_NEON2SSE_INLINE int16x8_t vmlaq_lane_s16(int16x8_t a, int16x8_t b, int16x4_t v, __constrange(0,3) int l) // VMLA.I16 q0, q0, d0[0] -{ - int16_t vlane; - int16x8_t c; - vlane = vget_lane_s16(v, l); - c = vdupq_n_s16(vlane); - return vmlaq_s16(a,b,c); -} - -int32x4_t vmlaq_lane_s32(int32x4_t a, int32x4_t b, int32x2_t v, __constrange(0,1) int l); // VMLA.I32 q0, q0, d0[0] -_NEON2SSE_INLINE int32x4_t vmlaq_lane_s32(int32x4_t a, int32x4_t b, int32x2_t v, __constrange(0,1) int l) // VMLA.I32 q0, q0, d0[0] -{ - int32_t vlane; - int32x4_t c; - vlane = vget_lane_s32(v, l); - c = vdupq_n_s32(vlane); - return vmlaq_s32(a,b,c); -} - -uint16x8_t vmlaq_lane_u16(uint16x8_t a, uint16x8_t b, uint16x4_t v, __constrange(0,3) int l); // VMLA.I16 q0, q0, d0[0] -#define vmlaq_lane_u16 vmlaq_lane_s16 - -uint32x4_t vmlaq_lane_u32(uint32x4_t a, uint32x4_t b, uint32x2_t v, __constrange(0,1) int l); // VMLA.I32 q0, q0, d0[0] -#define vmlaq_lane_u32 vmlaq_lane_s32 - -float32x4_t vmlaq_lane_f32(float32x4_t a, float32x4_t b, float32x2_t v, __constrange(0,1) int l); // VMLA.F32 q0, q0, d0[0] -_NEON2SSE_INLINE float32x4_t vmlaq_lane_f32(float32x4_t a, float32x4_t b, float32x2_t v, __constrange(0,1) int l) // VMLA.F32 q0, q0, d0[0] -{ - float32_t vlane; - float32x4_t c; - vlane = vget_lane_f32(v, l); - c = vdupq_n_f32(vlane); - return vmlaq_f32(a,b,c); -} - 
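The widening multiply-accumulate forms that follow differ from the plain lane forms above in that the products are kept at double width instead of wrapping at the element size. A scalar sketch of what `vmlal_lane_s16` computes (the reference function name is ours, for illustration only):

```c
#include <stdint.h>

/* Scalar model of vmlal_lane_s16: a 32-bit accumulator plus the
   full 32-bit product of 16-bit operands, no wrap at 16 bits. */
static void vmlal_lane_s16_ref(int32_t d[4], const int32_t a[4],
                               const int16_t b[4], const int16_t v[4], int lane)
{
    for (int i = 0; i < 4; i++)
        d[i] = a[i] + (int32_t)b[i] * (int32_t)v[lane];
}
```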
-//***************** Vector widening multiply accumulate by scalar ********************** -//*************************************************************************************** -int32x4_t vmlal_lane_s16(int32x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l); // VMLAL.S16 q0, d0, d0[0] -_NEON2SSE_INLINE int32x4_t vmlal_lane_s16(int32x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l) // VMLAL.S16 q0, d0, d0[0] -{ - int16_t vlane; - int16x4_t c; - vlane = vget_lane_s16(v, l); - c = vdup_n_s16(vlane); - return vmlal_s16(a, b, c); -} - -int64x2_t vmlal_lane_s32(int64x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l); // VMLAL.S32 q0, d0, d0[0] -_NEON2SSE_INLINE int64x2_t vmlal_lane_s32(int64x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l) // VMLAL.S32 q0, d0, d0[0] -{ - int32_t vlane; - int32x2_t c; - vlane = vget_lane_s32(v, l); - c = vdup_n_s32(vlane); - return vmlal_s32(a, b, c); -} - -uint32x4_t vmlal_lane_u16(uint32x4_t a, uint16x4_t b, uint16x4_t v, __constrange(0,3) int l); // VMLAL.s16 q0, d0, d0[0] -_NEON2SSE_INLINE uint32x4_t vmlal_lane_u16(uint32x4_t a, uint16x4_t b, uint16x4_t v, __constrange(0,3) int l) // VMLAL.s16 q0, d0, d0[0] -{ - uint16_t vlane; - uint16x4_t c; - vlane = vget_lane_u16(v, l); - c = vdup_n_u16(vlane); - return vmlal_u16(a, b, c); -} - -uint64x2_t vmlal_lane_u32(uint64x2_t a, uint32x2_t b, uint32x2_t v, __constrange(0,1) int l); // VMLAL.U32 q0, d0, d0[0] -_NEON2SSE_INLINE uint64x2_t vmlal_lane_u32(uint64x2_t a, uint32x2_t b, uint32x2_t v, __constrange(0,1) int l) // VMLAL.U32 q0, d0, d0[0] -{ - uint32_t vlane; - uint32x2_t c; - vlane = vget_lane_u32(v, l); - c = vdup_n_u32(vlane); - return vmlal_u32(a, b, c); -} - -// ******** Vector widening saturating doubling multiply accumulate by scalar ******************************* -// ************************************************************************************************ -int32x4_t vqdmlal_lane_s16(int32x4_t a, int16x4_t b, int16x4_t v, 
__constrange(0,3) int l); // VQDMLAL.S16 q0, d0, d0[0]
-_NEON2SSE_INLINE int32x4_t vqdmlal_lane_s16(int32x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l)
-{
-    int16_t vlane;
-    int16x4_t c;
-    vlane = vget_lane_s16(v, l);
-    c = vdup_n_s16(vlane);
-    return vqdmlal_s16(a, b, c);
-}
-
-int64x2_t vqdmlal_lane_s32(int64x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l); // VQDMLAL.S32 q0, d0, d0[0]
-_NEON2SSE_INLINE int64x2_t vqdmlal_lane_s32(int64x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l)
-{
-    int32_t vlane;
-    int32x2_t c;
-    vlane = vget_lane_s32(v, l);
-    c = vdup_n_s32(vlane);
-    return vqdmlal_s32(a, b, c);
-}
-
-// ****** Vector multiply subtract by scalar *****************
-// *************************************************************
-int16x4_t vmls_lane_s16(int16x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l); // VMLS.I16 d0, d0, d0[0]
-_NEON2SSE_INLINE int16x4_t vmls_lane_s16(int16x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l) // VMLS.I16 d0, d0, d0[0]
-{
-    int16_t vlane;
-    int16x4_t c;
-    vlane = vget_lane_s16(v, l);
-    c = vdup_n_s16(vlane);
-    return vmls_s16(a, b, c);
-}
-
-int32x2_t vmls_lane_s32(int32x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l); // VMLS.I32 d0, d0, d0[0]
-_NEON2SSE_INLINE int32x2_t vmls_lane_s32(int32x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l) // VMLS.I32 d0, d0, d0[0]
-{
-    int32_t vlane;
-    int32x2_t c;
-    vlane = vget_lane_s32(v, l);
-    c = vdup_n_s32(vlane);
-    return vmls_s32(a, b, c);
-}
-
-uint16x4_t vmls_lane_u16(uint16x4_t a, uint16x4_t b, uint16x4_t v, __constrange(0,3) int l); // VMLS.I16 d0, d0, d0[0]
-_NEON2SSE_INLINE uint16x4_t vmls_lane_u16(uint16x4_t a, uint16x4_t b, uint16x4_t v, __constrange(0,3) int l) // VMLS.I16 d0, d0, d0[0]
-{
-    uint16_t vlane;
-    uint16x4_t c;
-    vlane = vget_lane_u16(v, l);
-    c = vdup_n_u16(vlane);
-    return vmls_u16(a, b, c);
-}
-
-uint32x2_t vmls_lane_u32(uint32x2_t a, uint32x2_t b, uint32x2_t v, __constrange(0,1) int l);
// VMLS.I32 d0, d0, d0[0]
-_NEON2SSE_INLINE uint32x2_t vmls_lane_u32(uint32x2_t a, uint32x2_t b, uint32x2_t v, __constrange(0,1) int l) // VMLS.I32 d0, d0, d0[0]
-{
-    uint32_t vlane;
-    uint32x2_t c;
-    vlane = vget_lane_u32(v, l);
-    c = vdup_n_u32(vlane);
-    return vmls_u32(a, b, c);
-}
-
-float32x2_t vmls_lane_f32(float32x2_t a, float32x2_t b, float32x2_t v, __constrange(0,1) int l); // VMLS.F32 d0, d0, d0[0]
-_NEON2SSE_INLINE float32x2_t vmls_lane_f32(float32x2_t a, float32x2_t b, float32x2_t v, __constrange(0,1) int l)
-{
-    float32_t vlane;
-    float32x2_t c;
-    vlane = vget_lane_f32(v, l);
-    c = vdup_n_f32(vlane);
-    return vmls_f32(a,b,c);
-}
-
-int16x8_t vmlsq_lane_s16(int16x8_t a, int16x8_t b, int16x4_t v, __constrange(0,3) int l); // VMLS.I16 q0, q0, d0[0]
-_NEON2SSE_INLINE int16x8_t vmlsq_lane_s16(int16x8_t a, int16x8_t b, int16x4_t v, __constrange(0,3) int l) // VMLS.I16 q0, q0, d0[0]
-{
-    int16_t vlane;
-    int16x8_t c;
-    vlane = vget_lane_s16(v, l);
-    c = vdupq_n_s16(vlane);
-    return vmlsq_s16(a, b,c);
-}
-
-int32x4_t vmlsq_lane_s32(int32x4_t a, int32x4_t b, int32x2_t v, __constrange(0,1) int l); // VMLS.I32 q0, q0, d0[0]
-_NEON2SSE_INLINE int32x4_t vmlsq_lane_s32(int32x4_t a, int32x4_t b, int32x2_t v, __constrange(0,1) int l) // VMLS.I32 q0, q0, d0[0]
-{
-    int32_t vlane;
-    int32x4_t c;
-    vlane = vget_lane_s32(v, l);
-    c = vdupq_n_s32(vlane);
-    return vmlsq_s32(a,b,c);
-}
-
-uint16x8_t vmlsq_lane_u16(uint16x8_t a, uint16x8_t b, uint16x4_t v, __constrange(0,3) int l); // VMLS.I16 q0, q0, d0[0]
-_NEON2SSE_INLINE uint16x8_t vmlsq_lane_u16(uint16x8_t a, uint16x8_t b, uint16x4_t v, __constrange(0,3) int l) // VMLS.I16 q0, q0, d0[0]
-{
-    uint16_t vlane;
-    uint16x8_t c;
-    vlane = vget_lane_u16(v, l);
-    c = vdupq_n_u16(vlane);
-    return vmlsq_u16(a,b,c);
-}
-
-uint32x4_t vmlsq_lane_u32(uint32x4_t a, uint32x4_t b, uint32x2_t v, __constrange(0,1) int l); // VMLS.I32 q0, q0, d0[0]
-_NEON2SSE_INLINE uint32x4_t vmlsq_lane_u32(uint32x4_t a, uint32x4_t b, uint32x2_t v,
__constrange(0,1) int l) // VMLS.I32 q0, q0, d0[0]
-{
-    uint32_t vlane;
-    uint32x4_t c;
-    vlane = vget_lane_u32(v, l);
-    c = vdupq_n_u32(vlane);
-    return vmlsq_u32(a,b,c);
-}
-
-float32x4_t vmlsq_lane_f32(float32x4_t a, float32x4_t b, float32x2_t v, __constrange(0,1) int l); // VMLS.F32 q0, q0, d0[0]
-_NEON2SSE_INLINE float32x4_t vmlsq_lane_f32(float32x4_t a, float32x4_t b, float32x2_t v, __constrange(0,1) int l) // VMLS.F32 q0, q0, d0[0]
-{
-    float32_t vlane;
-    float32x4_t c;
-    vlane = vget_lane_f32(v, l);
-    c = vdupq_n_f32(vlane);
-    return vmlsq_f32(a,b,c);
-}
-
-// **** Vector widening multiply subtract by scalar ****
-// ****************************************************
-int32x4_t vmlsl_lane_s16(int32x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l); // VMLSL.S16 q0, d0, d0[0]
-_NEON2SSE_INLINE int32x4_t vmlsl_lane_s16(int32x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l) // VMLSL.S16 q0, d0, d0[0]
-{
-    int16_t vlane;
-    int16x4_t c;
-    vlane = vget_lane_s16(v, l);
-    c = vdup_n_s16(vlane);
-    return vmlsl_s16(a, b, c);
-}
-
-int64x2_t vmlsl_lane_s32(int64x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l); // VMLSL.S32 q0, d0, d0[0]
-_NEON2SSE_INLINE int64x2_t vmlsl_lane_s32(int64x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l) // VMLSL.S32 q0, d0, d0[0]
-{
-    int32_t vlane;
-    int32x2_t c;
-    vlane = vget_lane_s32(v, l);
-    c = vdup_n_s32(vlane);
-    return vmlsl_s32(a, b, c);
-}
-
-uint32x4_t vmlsl_lane_u16(uint32x4_t a, uint16x4_t b, uint16x4_t v, __constrange(0,3) int l); // VMLSL.U16 q0, d0, d0[0]
-_NEON2SSE_INLINE uint32x4_t vmlsl_lane_u16(uint32x4_t a, uint16x4_t b, uint16x4_t v, __constrange(0,3) int l) // VMLSL.U16 q0, d0, d0[0]
-{
-    uint16_t vlane;
-    uint16x4_t c;
-    vlane = vget_lane_u16(v, l);
-    c = vdup_n_u16(vlane);
-    return vmlsl_u16(a, b, c); //the unsigned widening multiply must be used here, the signed one would give wrong results for inputs >= 0x8000
-}
-
-uint64x2_t vmlsl_lane_u32(uint64x2_t a, uint32x2_t b, uint32x2_t v, __constrange(0,1) int l); // VMLSL.U32 q0, d0, d0[0]
-_NEON2SSE_INLINE uint64x2_t
vmlsl_lane_u32(uint64x2_t a, uint32x2_t b, uint32x2_t v, __constrange(0,1) int l) // VMLAL.U32 q0, d0, d0[0] -{ - uint32_t vlane; - uint32x2_t c; - vlane = vget_lane_u32(v, l); - c = vdup_n_u32(vlane); - return vmlsl_u32(a, b, c); -} - -//********* Vector widening saturating doubling multiply subtract by scalar ************************** -//****************************************************************************************************** -int32x4_t vqdmlsl_lane_s16(int32x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l); // VQDMLSL.S16 q0, d0, d0[0] -_NEON2SSE_INLINE int32x4_t vqdmlsl_lane_s16(int32x4_t a, int16x4_t b, int16x4_t v, __constrange(0,3) int l) -{ - int16_t vlane; - int16x4_t c; - vlane = vget_lane_s16(v, l); - c = vdup_n_s16(vlane); - return vqdmlsl_s16(a, b, c); -} - -int64x2_t vqdmlsl_lane_s32(int64x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l); // VQDMLSL.S32 q0, d0, d0[0] -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x2_t vqdmlsl_lane_s32(int64x2_t a, int32x2_t b, int32x2_t v, __constrange(0,1) int l), _NEON2SSE_REASON_SLOW_SERIAL) -{ - int32_t vlane; - int32x2_t c; - vlane = vget_lane_s32(v, l); - c = vdup_n_s32(vlane); - return vqdmlsl_s32(a, b, c); -} -//********** Vector multiply with scalar ***************************** -int16x4_t vmul_n_s16(int16x4_t a, int16_t b); // VMUL.I16 d0,d0,d0[0] -_NEON2SSE_INLINE int16x4_t vmul_n_s16(int16x4_t a, int16_t b) // VMUL.I16 d0,d0,d0[0] -{ - int16x4_t b16x4; - b16x4 = vdup_n_s16(b); - return vmul_s16(a, b16x4); -} - -int32x2_t vmul_n_s32(int32x2_t a, int32_t b); // VMUL.I32 d0,d0,d0[0] -_NEON2SSE_INLINE int32x2_t vmul_n_s32(int32x2_t a, int32_t b) // VMUL.I32 d0,d0,d0[0] -{ - //serial solution looks faster - int32x2_t b32x2; - b32x2 = vdup_n_s32(b); - return vmul_s32(a, b32x2); -} - -float32x2_t vmul_n_f32(float32x2_t a, float32_t b); // VMUL.F32 d0,d0,d0[0] -_NEON2SSE_INLINE float32x2_t vmul_n_f32(float32x2_t a, float32_t b) // VMUL.F32 d0,d0,d0[0] -{ - float32x2_t b32x2; - 
b32x2 = vdup_n_f32(b);
-    return vmul_f32(a, b32x2);
-}
-
-uint16x4_t vmul_n_u16(uint16x4_t a, uint16_t b); // VMUL.I16 d0,d0,d0[0]
-_NEON2SSE_INLINE uint16x4_t vmul_n_u16(uint16x4_t a, uint16_t b) // VMUL.I16 d0,d0,d0[0]
-{
-    uint16x4_t b16x4;
-    b16x4 = vdup_n_u16(b);
-    return vmul_u16(a, b16x4);
-}
-
-uint32x2_t vmul_n_u32(uint32x2_t a, uint32_t b); // VMUL.I32 d0,d0,d0[0]
-_NEON2SSE_INLINE uint32x2_t vmul_n_u32(uint32x2_t a, uint32_t b) // VMUL.I32 d0,d0,d0[0]
-{
-    //serial solution looks faster
-    uint32x2_t b32x2;
-    b32x2 = vdup_n_u32(b);
-    return vmul_u32(a, b32x2);
-}
-
-int16x8_t vmulq_n_s16(int16x8_t a, int16_t b); // VMUL.I16 q0,q0,d0[0]
-_NEON2SSE_INLINE int16x8_t vmulq_n_s16(int16x8_t a, int16_t b) // VMUL.I16 q0,q0,d0[0]
-{
-    int16x8_t b16x8;
-    b16x8 = vdupq_n_s16(b);
-    return vmulq_s16(a, b16x8);
-}
-
-int32x4_t vmulq_n_s32(int32x4_t a, int32_t b); // VMUL.I32 q0,q0,d0[0]
-_NEON2SSE_INLINE int32x4_t vmulq_n_s32(int32x4_t a, int32_t b) // VMUL.I32 q0,q0,d0[0]
-{
-    int32x4_t b32x4;
-    b32x4 = vdupq_n_s32(b);
-    return vmulq_s32(a, b32x4);
-}
-
-float32x4_t vmulq_n_f32(float32x4_t a, float32_t b); // VMUL.F32 q0,q0,d0[0]
-_NEON2SSE_INLINE float32x4_t vmulq_n_f32(float32x4_t a, float32_t b) // VMUL.F32 q0,q0,d0[0]
-{
-    float32x4_t b32x4;
-    b32x4 = vdupq_n_f32(b);
-    return vmulq_f32(a, b32x4);
-}
-
-uint16x8_t vmulq_n_u16(uint16x8_t a, uint16_t b); // VMUL.I16 q0,q0,d0[0]
-_NEON2SSE_INLINE uint16x8_t vmulq_n_u16(uint16x8_t a, uint16_t b) // VMUL.I16 q0,q0,d0[0]
-{
-    uint16x8_t b16x8;
-    b16x8 = vdupq_n_u16(b);
-    return vmulq_u16(a, b16x8);
-}
-
-uint32x4_t vmulq_n_u32(uint32x4_t a, uint32_t b); // VMUL.I32 q0,q0,d0[0]
-_NEON2SSE_INLINE uint32x4_t vmulq_n_u32(uint32x4_t a, uint32_t b) // VMUL.I32 q0,q0,d0[0]
-{
-    uint32x4_t b32x4;
-    b32x4 = vdupq_n_u32(b);
-    return vmulq_u32(a, b32x4);
-}
-
-//********** Vector multiply lane *****************************
-int16x4_t vmul_lane_s16 (int16x4_t a, int16x4_t b, __constrange(0,3) int c);
-_NEON2SSE_INLINE int16x4_t
vmul_lane_s16 (int16x4_t a, int16x4_t b, __constrange(0,3) int c) -{ - int16x4_t b16x4; - int16_t vlane; - vlane = vget_lane_s16(b, c); - b16x4 = vdup_n_s16(vlane); - return vmul_s16(a, b16x4); -} - -int32x2_t vmul_lane_s32 (int32x2_t a, int32x2_t b, __constrange(0,1) int c); -_NEON2SSE_INLINE int32x2_t vmul_lane_s32 (int32x2_t a, int32x2_t b, __constrange(0,1) int c) -{ - int32x2_t b32x2; - int32_t vlane; - vlane = vget_lane_s32(b, c); - b32x2 = vdup_n_s32(vlane); - return vmul_s32(a, b32x2); -} - -float32x2_t vmul_lane_f32 (float32x2_t a, float32x2_t b, __constrange(0,1) int c); -_NEON2SSE_INLINE float32x2_t vmul_lane_f32 (float32x2_t a, float32x2_t b, __constrange(0,1) int c) -{ - float32x2_t b32x2; - float32_t vlane; - vlane = vget_lane_f32(b, c); - b32x2 = vdup_n_f32(vlane); - return vmul_f32(a, b32x2); -} - -uint16x4_t vmul_lane_u16 (uint16x4_t a, uint16x4_t b, __constrange(0,3) int c); -#define vmul_lane_u16 vmul_lane_s16 - -uint32x2_t vmul_lane_u32 (uint32x2_t a, uint32x2_t b, __constrange(0,1) int c); -#define vmul_lane_u32 vmul_lane_s32 - -int16x8_t vmulq_lane_s16(int16x8_t a, int16x4_t b, __constrange(0,3) int c); -_NEON2SSE_INLINE int16x8_t vmulq_lane_s16 (int16x8_t a, int16x4_t b, __constrange(0,3) int c) -{ - int16x8_t b16x8; - int16_t vlane; - vlane = vget_lane_s16(b, c); - b16x8 = vdupq_n_s16(vlane); - return vmulq_s16(a, b16x8); -} - -int32x4_t vmulq_lane_s32 (int32x4_t a, int32x2_t b, __constrange(0,1) int c); -_NEON2SSE_INLINE int32x4_t vmulq_lane_s32 (int32x4_t a, int32x2_t b, __constrange(0,1) int c) -{ - int32x4_t b32x4; - int32_t vlane; - vlane = vget_lane_s32(b, c); - b32x4 = vdupq_n_s32(vlane); - return vmulq_s32(a, b32x4); -} - -float32x4_t vmulq_lane_f32 (float32x4_t a, float32x2_t b, __constrange(0,1) int c); -_NEON2SSE_INLINE float32x4_t vmulq_lane_f32 (float32x4_t a, float32x2_t b, __constrange(0,1) int c) -{ - float32x4_t b32x4; - float32_t vlane; - vlane = vget_lane_f32(b, c); - b32x4 = vdupq_n_f32(vlane); - return vmulq_f32(a, 
b32x4);
-}
-
-uint16x8_t vmulq_lane_u16 (uint16x8_t a, uint16x4_t b, __constrange(0,3) int c);
-#define vmulq_lane_u16 vmulq_lane_s16
-
-uint32x4_t vmulq_lane_u32 (uint32x4_t a, uint32x2_t b, __constrange(0,1) int c);
-#define vmulq_lane_u32 vmulq_lane_s32
-
-//**** Vector long multiply with scalar ************
-int32x4_t vmull_n_s16(int16x4_t vec1, int16_t val2); // VMULL.S16 q0,d0,d0[0]
-_NEON2SSE_INLINE int32x4_t vmull_n_s16(int16x4_t vec1, int16_t val2) // VMULL.S16 q0,d0,d0[0]
-{
-    int16x4_t b16x4;
-    b16x4 = vdup_n_s16(val2);
-    return vmull_s16(vec1, b16x4);
-}
-
-int64x2_t vmull_n_s32(int32x2_t vec1, int32_t val2); // VMULL.S32 q0,d0,d0[0]
-_NEON2SSE_INLINE int64x2_t vmull_n_s32(int32x2_t vec1, int32_t val2) // VMULL.S32 q0,d0,d0[0]
-{
-    int32x2_t b32x2;
-    b32x2 = vdup_n_s32(val2);
-    return vmull_s32(vec1, b32x2);
-}
-
-uint32x4_t vmull_n_u16(uint16x4_t vec1, uint16_t val2); // VMULL.U16 q0,d0,d0[0]
-_NEON2SSE_INLINE uint32x4_t vmull_n_u16(uint16x4_t vec1, uint16_t val2) // VMULL.U16 q0,d0,d0[0]
-{
-    uint16x4_t b16x4;
-    b16x4 = vdup_n_u16(val2);
-    return vmull_u16(vec1, b16x4); //the unsigned widening multiply must be used here, the signed one would give wrong results for inputs >= 0x8000
-}
-
-uint64x2_t vmull_n_u32(uint32x2_t vec1, uint32_t val2); // VMULL.U32 q0,d0,d0[0]
-_NEON2SSE_INLINE uint64x2_t vmull_n_u32(uint32x2_t vec1, uint32_t val2) // VMULL.U32 q0,d0,d0[0]
-{
-    uint32x2_t b32x2;
-    b32x2 = vdup_n_u32(val2);
-    return vmull_u32(vec1, b32x2);
-}
-
-//**** Vector long multiply by scalar ****
-int32x4_t vmull_lane_s16(int16x4_t vec1, int16x4_t val2, __constrange(0, 3) int val3); // VMULL.S16 q0,d0,d0[0]
-_NEON2SSE_INLINE int32x4_t vmull_lane_s16(int16x4_t vec1, int16x4_t val2, __constrange(0, 3) int val3) // VMULL.S16 q0,d0,d0[0]
-{
-    int16_t vlane;
-    int16x4_t b;
-    vlane = vget_lane_s16(val2, val3);
-    b = vdup_n_s16(vlane);
-    return vmull_s16(vec1, b);
-}
-
-int64x2_t vmull_lane_s32(int32x2_t vec1, int32x2_t val2, __constrange(0, 1) int val3); // VMULL.S32 q0,d0,d0[0]
-_NEON2SSE_INLINE int64x2_t vmull_lane_s32(int32x2_t vec1, int32x2_t val2, __constrange(0, 1)
int val3) // VMULL.S32 q0,d0,d0[0]
-{
-    int32_t vlane;
-    int32x2_t b;
-    vlane = vget_lane_s32(val2, val3);
-    b = vdup_n_s32(vlane);
-    return vmull_s32(vec1, b);
-}
-
-uint32x4_t vmull_lane_u16(uint16x4_t vec1, uint16x4_t val2, __constrange(0, 3) int val3); // VMULL.U16 q0,d0,d0[0]
-_NEON2SSE_INLINE uint32x4_t vmull_lane_u16(uint16x4_t vec1, uint16x4_t val2, __constrange(0, 3) int val3) // VMULL.U16 q0,d0,d0[0]
-{
-    uint16_t vlane;
-    uint16x4_t b;
-    vlane = vget_lane_u16(val2, val3);
-    b = vdup_n_u16(vlane);
-    return vmull_u16(vec1, b); //the unsigned widening multiply must be used here, the signed one would give wrong results for inputs >= 0x8000
-}
-
-uint64x2_t vmull_lane_u32(uint32x2_t vec1, uint32x2_t val2, __constrange(0, 1) int val3); // VMULL.U32 q0,d0,d0[0]
-_NEON2SSE_INLINE uint64x2_t vmull_lane_u32(uint32x2_t vec1, uint32x2_t val2, __constrange(0, 1) int val3) // VMULL.U32 q0,d0,d0[0]
-{
-    uint32_t vlane;
-    uint32x2_t b;
-    vlane = vget_lane_u32(val2, val3);
-    b = vdup_n_u32(vlane);
-    return vmull_u32(vec1, b);
-}
-
-//********* Vector saturating doubling long multiply with scalar *******************
-int32x4_t vqdmull_n_s16(int16x4_t vec1, int16_t val2); // VQDMULL.S16 q0,d0,d0[0]
-_NEON2SSE_INLINE int32x4_t vqdmull_n_s16(int16x4_t vec1, int16_t val2)
-{
-    //the serial solution may be faster due to saturation
-    int16x4_t b;
-    b = vdup_n_s16(val2);
-    return vqdmull_s16(vec1, b);
-}
-
-int64x2_t vqdmull_n_s32(int32x2_t vec1, int32_t val2); // VQDMULL.S32 q0,d0,d0[0]
-_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x2_t vqdmull_n_s32(int32x2_t vec1, int32_t val2), _NEON2SSE_REASON_SLOW_SERIAL)
-{
-    int32x2_t b;
-    b = vdup_n_s32(val2);
-    return vqdmull_s32(vec1,b); //slow serial function!!!!
-} - -//************* Vector saturating doubling long multiply by scalar *********************************************** -int32x4_t vqdmull_lane_s16(int16x4_t vec1, int16x4_t val2, __constrange(0, 3) int val3); // VQDMULL.S16 q0,d0,d0[0] -_NEON2SSE_INLINE int32x4_t vqdmull_lane_s16(int16x4_t vec1, int16x4_t val2, __constrange(0, 3) int val3) -{ - int16_t c; - int16x4_t scalar; - c = vget_lane_s16(val2, val3); - scalar = vdup_n_s16(c); - return vqdmull_s16(vec1, scalar); -} - - -int64x2_t vqdmull_lane_s32(int32x2_t vec1, int32x2_t val2, __constrange(0, 1) int val3); // VQDMULL.S32 q0,d0,d0[0] -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x2_t vqdmull_lane_s32(int32x2_t vec1, int32x2_t val2, __constrange(0, 1) int val3), _NEON2SSE_REASON_SLOW_SERIAL) -{ - int32_t c; - int32x2_t scalar; - c = vget_lane_s32(val2, val3); - scalar = vdup_n_s32(c); - return vqdmull_s32(vec1,scalar); //slow serial function!!!! -} - -// *****Vector saturating doubling multiply high with scalar ***** -int16x4_t vqdmulh_n_s16(int16x4_t vec1, int16_t val2); // VQDMULH.S16 d0,d0,d0[0] -_NEON2SSE_INLINE int16x4_t vqdmulh_n_s16(int16x4_t vec1, int16_t val2) -{ - int16x4_t res64; - return64(vqdmulhq_n_s16(_pM128i(vec1), val2)); -} - -int32x2_t vqdmulh_n_s32(int32x2_t vec1, int32_t val2); // VQDMULH.S32 d0,d0,d0[0] -_NEON2SSE_INLINE int32x2_t vqdmulh_n_s32(int32x2_t vec1, int32_t val2) -{ - int32x2_t res64; - return64(vqdmulhq_n_s32(_pM128i(vec1), val2)); -} - -int16x8_t vqdmulhq_n_s16(int16x8_t vec1, int16_t val2); // VQDMULH.S16 q0,q0,d0[0] -_NEON2SSE_INLINE int16x8_t vqdmulhq_n_s16(int16x8_t vec1, int16_t val2) // VQDMULH.S16 q0,q0,d0[0] -{ - //solution may be not optimal - int16x8_t scalar; - scalar = vdupq_n_s16(val2); - return vqdmulhq_s16(vec1, scalar); -} - -int32x4_t vqdmulhq_n_s32(int32x4_t vec1, int32_t val2); // VQDMULH.S32 q0,q0,d0[0] -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x4_t vqdmulhq_n_s32(int32x4_t vec1, int32_t val2), _NEON2SSE_REASON_SLOW_UNEFFECTIVE) -{ - 
int32x4_t scalar; - scalar = vdupq_n_s32(val2); - return vqdmulhq_s32(vec1, scalar); -} - -//***** Vector saturating doubling multiply high by scalar **************** -int16x4_t vqdmulh_lane_s16(int16x4_t vec1, int16x4_t val2, __constrange(0, 3) int val3); // VQDMULH.S16 d0,d0,d0[0] -_NEON2SSE_INLINE int16x4_t vqdmulh_lane_s16(int16x4_t vec1, int16x4_t val2, __constrange(0, 3) int val3) // VQDMULH.S16 d0,d0,d0[0] -{ - //solution may not be optimal - int16_t vlane; - int16x4_t scalar; - vlane = vget_lane_s16(val2, val3); - scalar = vdup_n_s16(vlane); - return vqdmulh_s16(vec1, scalar); -} - -int32x2_t vqdmulh_lane_s32(int32x2_t vec1, int32x2_t val2, __constrange(0, 1) int val3); // VQDMULH.S32 d0,d0,d0[0] -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x2_t vqdmulh_lane_s32(int32x2_t vec1, int32x2_t val2, __constrange(0, 1) int val3), _NEON2SSE_REASON_SLOW_UNEFFECTIVE) -{ - int32_t vlane; - int32x2_t scalar; - vlane = vget_lane_s32(val2, val3); - scalar = vdup_n_s32(vlane); - return vqdmulh_s32(vec1, scalar); -} - -int16x8_t vqdmulhq_lane_s16(int16x8_t vec1, int16x4_t val2, __constrange(0, 3) int val3); // VQDMULH.S16 q0,q0,d0[0] -_NEON2SSE_INLINE int16x8_t vqdmulhq_lane_s16(int16x8_t vec1, int16x4_t val2, __constrange(0, 3) int val3) // VQDMULH.S16 q0,q0,d0[0] -{ - //solution may not be optimal - int16_t vlane; - int16x8_t scalar; - vlane = vget_lane_s16(val2, val3); - scalar = vdupq_n_s16(vlane); - return vqdmulhq_s16(vec1, scalar); -} - -int32x4_t vqdmulhq_lane_s32(int32x4_t vec1, int32x2_t val2, __constrange(0, 1) int val3); // VQDMULH.S32 q0,q0,d0[0] -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x4_t vqdmulhq_lane_s32(int32x4_t vec1, int32x2_t val2, __constrange(0, 1) int val3), _NEON2SSE_REASON_SLOW_UNEFFECTIVE) -{ - //solution may not be optimal - int32_t vlane; - int32x4_t scalar; - vlane = vgetq_lane_s32(_pM128i(val2), val3); - scalar = vdupq_n_s32(vlane); - return vqdmulhq_s32(vec1, scalar); -} - -//******** Vector saturating rounding 
doubling multiply high with scalar *** -int16x4_t vqrdmulh_n_s16(int16x4_t vec1, int16_t val2); // VQRDMULH.S16 d0,d0,d0[0] -_NEON2SSE_INLINE int16x4_t vqrdmulh_n_s16(int16x4_t vec1, int16_t val2) // VQRDMULH.S16 d0,d0,d0[0] -{ - //solution may not be optimal - int16x4_t scalar; - scalar = vdup_n_s16(val2); - return vqrdmulh_s16(vec1, scalar); -} - -int32x2_t vqrdmulh_n_s32(int32x2_t vec1, int32_t val2); // VQRDMULH.S32 d0,d0,d0[0] -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x2_t vqrdmulh_n_s32(int32x2_t vec1, int32_t val2), _NEON2SSE_REASON_SLOW_UNEFFECTIVE) -{ - int32x2_t scalar; - scalar = vdup_n_s32(val2); - return vqrdmulh_s32(vec1, scalar); -} - -int16x8_t vqrdmulhq_n_s16(int16x8_t vec1, int16_t val2); // VQRDMULH.S16 q0,q0,d0[0] -_NEON2SSE_INLINE int16x8_t vqrdmulhq_n_s16(int16x8_t vec1, int16_t val2) // VQRDMULH.S16 q0,q0,d0[0] -{ - //solution may not be optimal - int16x8_t scalar; - scalar = vdupq_n_s16(val2); - return vqrdmulhq_s16(vec1, scalar); -} - -int32x4_t vqrdmulhq_n_s32(int32x4_t vec1, int32_t val2); // VQRDMULH.S32 q0,q0,d0[0] -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x4_t vqrdmulhq_n_s32(int32x4_t vec1, int32_t val2), _NEON2SSE_REASON_SLOW_UNEFFECTIVE) -{ - int32x4_t scalar; - scalar = vdupq_n_s32(val2); - return vqrdmulhq_s32(vec1, scalar); -} - -//********* Vector rounding saturating doubling multiply high by scalar **** -int16x4_t vqrdmulh_lane_s16(int16x4_t vec1, int16x4_t val2, __constrange(0, 3) int val3); // VQRDMULH.S16 d0,d0,d0[0] -_NEON2SSE_INLINE int16x4_t vqrdmulh_lane_s16(int16x4_t vec1, int16x4_t val2, __constrange(0, 3) int val3) // VQRDMULH.S16 d0,d0,d0[0] -{ - //solution may not be optimal - int16_t vlane; - int16x4_t scalar; - vlane = vget_lane_s16(val2, val3); - scalar = vdup_n_s16(vlane); - return vqrdmulh_s16(vec1, scalar); -} - -int32x2_t vqrdmulh_lane_s32(int32x2_t vec1, int32x2_t val2, __constrange(0, 1) int val3); // VQRDMULH.S32 d0,d0,d0[0] -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x2_t 
vqrdmulh_lane_s32(int32x2_t vec1, int32x2_t val2, __constrange(0, 1) int val3), _NEON2SSE_REASON_SLOW_UNEFFECTIVE) -{ - int32_t vlane; - int32x2_t scalar; - vlane = vget_lane_s32(val2, val3); - scalar = vdup_n_s32(vlane); - return vqrdmulh_s32(vec1, scalar); -} - -int16x8_t vqrdmulhq_lane_s16(int16x8_t vec1, int16x4_t val2, __constrange(0, 3) int val3); // VQRDMULH.S16 q0,q0,d0[0] -_NEON2SSE_INLINE int16x8_t vqrdmulhq_lane_s16(int16x8_t vec1, int16x4_t val2, __constrange(0, 3) int val3) // VQRDMULH.S16 q0,q0,d0[0] -{ - //solution may not be optimal - int16_t vlane; - int16x8_t scalar; - vlane = vget_lane_s16(val2, val3); - scalar = vdupq_n_s16(vlane); - return vqrdmulhq_s16(vec1, scalar); -} - -int32x4_t vqrdmulhq_lane_s32(int32x4_t vec1, int32x2_t val2, __constrange(0, 1) int val3); // VQRDMULH.S32 q0,q0,d0[0] -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int32x4_t vqrdmulhq_lane_s32(int32x4_t vec1, int32x2_t val2, __constrange(0, 1) int val3), _NEON2SSE_REASON_SLOW_UNEFFECTIVE) -{ - //solution may not be optimal - int32_t vlane; - int32x4_t scalar; - vlane = vgetq_lane_s32(_pM128i(val2), val3); - scalar = vdupq_n_s32(vlane); - return vqrdmulhq_s32(vec1, scalar); -} - -//**************Vector multiply accumulate with scalar ******************* -int16x4_t vmla_n_s16(int16x4_t a, int16x4_t b, int16_t c); // VMLA.I16 d0, d0, d0[0] -_NEON2SSE_INLINE int16x4_t vmla_n_s16(int16x4_t a, int16x4_t b, int16_t c) // VMLA.I16 d0, d0, d0[0] -{ - int16x4_t scalar; - scalar = vdup_n_s16(c); - return vmla_s16(a, b, scalar); -} - -int32x2_t vmla_n_s32(int32x2_t a, int32x2_t b, int32_t c); // VMLA.I32 d0, d0, d0[0] -_NEON2SSE_INLINE int32x2_t vmla_n_s32(int32x2_t a, int32x2_t b, int32_t c) // VMLA.I32 d0, d0, d0[0] -{ - int32x2_t scalar; - scalar = vdup_n_s32(c); - return vmla_s32(a, b, scalar); -} - -uint16x4_t vmla_n_u16(uint16x4_t a, uint16x4_t b, uint16_t c); // VMLA.I16 d0, d0, d0[0] -#define vmla_n_u16 vmla_n_s16 - - -uint32x2_t vmla_n_u32(uint32x2_t a, uint32x2_t b, 
uint32_t c); // VMLA.I32 d0, d0, d0[0] -#define vmla_n_u32 vmla_n_s32 - - -float32x2_t vmla_n_f32(float32x2_t a, float32x2_t b, float32_t c); // VMLA.F32 d0, d0, d0[0] -_NEON2SSE_INLINE float32x2_t vmla_n_f32(float32x2_t a, float32x2_t b, float32_t c) // VMLA.F32 d0, d0, d0[0] -{ - float32x2_t scalar; - scalar = vdup_n_f32(c); - return vmla_f32(a, b, scalar); -} - -int16x8_t vmlaq_n_s16(int16x8_t a, int16x8_t b, int16_t c); // VMLA.I16 q0, q0, d0[0] -_NEON2SSE_INLINE int16x8_t vmlaq_n_s16(int16x8_t a, int16x8_t b, int16_t c) // VMLA.I16 q0, q0, d0[0] -{ - int16x8_t scalar; - scalar = vdupq_n_s16(c); - return vmlaq_s16(a,b,scalar); -} - -int32x4_t vmlaq_n_s32(int32x4_t a, int32x4_t b, int32_t c); // VMLA.I32 q0, q0, d0[0] -_NEON2SSE_INLINE int32x4_t vmlaq_n_s32(int32x4_t a, int32x4_t b, int32_t c) // VMLA.I32 q0, q0, d0[0] -{ - int32x4_t scalar; - scalar = vdupq_n_s32(c); - return vmlaq_s32(a,b,scalar); -} - -uint16x8_t vmlaq_n_u16(uint16x8_t a, uint16x8_t b, uint16_t c); // VMLA.I16 q0, q0, d0[0] -#define vmlaq_n_u16 vmlaq_n_s16 - -uint32x4_t vmlaq_n_u32(uint32x4_t a, uint32x4_t b, uint32_t c); // VMLA.I32 q0, q0, d0[0] -#define vmlaq_n_u32 vmlaq_n_s32 - -float32x4_t vmlaq_n_f32(float32x4_t a, float32x4_t b, float32_t c); // VMLA.F32 q0, q0, d0[0] -_NEON2SSE_INLINE float32x4_t vmlaq_n_f32(float32x4_t a, float32x4_t b, float32_t c) // VMLA.F32 q0, q0, d0[0] -{ - float32x4_t scalar; - scalar = vdupq_n_f32(c); - return vmlaq_f32(a,b,scalar); -} - -//************Vector widening multiply accumulate with scalar**************************** -int32x4_t vmlal_n_s16(int32x4_t a, int16x4_t b, int16_t c); // VMLAL.S16 q0, d0, d0[0] -_NEON2SSE_INLINE int32x4_t vmlal_n_s16(int32x4_t a, int16x4_t b, int16_t c) // VMLAL.S16 q0, d0, d0[0] -{ - int16x4_t vc; - vc = vdup_n_s16(c); - return vmlal_s16(a, b, vc); -} - -int64x2_t vmlal_n_s32(int64x2_t a, int32x2_t b, int32_t c); // VMLAL.S32 q0, d0, d0[0] -_NEON2SSE_INLINE int64x2_t vmlal_n_s32(int64x2_t a, int32x2_t b, int32_t c) // 
VMLAL.S32 q0, d0, d0[0] -{ - int32x2_t vc; - vc = vdup_n_s32(c); - return vmlal_s32(a, b, vc); -} - -uint32x4_t vmlal_n_u16(uint32x4_t a, uint16x4_t b, uint16_t c); // VMLAL.U16 q0, d0, d0[0] -_NEON2SSE_INLINE uint32x4_t vmlal_n_u16(uint32x4_t a, uint16x4_t b, uint16_t c) // VMLAL.U16 q0, d0, d0[0] -{ - uint16x4_t vc; - vc = vdup_n_u16(c); - return vmlal_u16(a, b, vc); -} - -uint64x2_t vmlal_n_u32(uint64x2_t a, uint32x2_t b, uint32_t c); // VMLAL.U32 q0, d0, d0[0] -_NEON2SSE_INLINE uint64x2_t vmlal_n_u32(uint64x2_t a, uint32x2_t b, uint32_t c) // VMLAL.U32 q0, d0, d0[0] -{ - uint32x2_t vc; - vc = vdup_n_u32(c); - return vmlal_u32(a, b, vc); -} - -//************ Vector widening saturating doubling multiply accumulate with scalar ************** -int32x4_t vqdmlal_n_s16(int32x4_t a, int16x4_t b, int16_t c); // VQDMLAL.S16 q0, d0, d0[0] -_NEON2SSE_INLINE int32x4_t vqdmlal_n_s16(int32x4_t a, int16x4_t b, int16_t c) -{ - //not an optimal SIMD solution, serial may be faster - int16x4_t vc; - vc = vdup_n_s16(c); - return vqdmlal_s16(a, b, vc); -} - -int64x2_t vqdmlal_n_s32(int64x2_t a, int32x2_t b, int32_t c); // VQDMLAL.S32 q0, d0, d0[0] -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x2_t vqdmlal_n_s32(int64x2_t a, int32x2_t b, int32_t c), _NEON2SSE_REASON_SLOW_SERIAL) -{ - int32x2_t vc; - vc = vdup_n_s32(c); - return vqdmlal_s32(a, b, vc); -} - -//******** Vector multiply subtract with scalar ************** -int16x4_t vmls_n_s16(int16x4_t a, int16x4_t b, int16_t c); // VMLS.I16 d0, d0, d0[0] -_NEON2SSE_INLINE int16x4_t vmls_n_s16(int16x4_t a, int16x4_t b, int16_t c) // VMLS.I16 d0, d0, d0[0] -{ - int16x4_t vc; - vc = vdup_n_s16(c); - return vmls_s16(a, b, vc); -} - -int32x2_t vmls_n_s32(int32x2_t a, int32x2_t b, int32_t c); // VMLS.I32 d0, d0, d0[0] -_NEON2SSE_INLINE int32x2_t vmls_n_s32(int32x2_t a, int32x2_t b, int32_t c) // VMLS.I32 d0, d0, d0[0] -{ - int32x2_t vc; - vc = vdup_n_s32(c); - return vmls_s32(a, b, vc); -} - -uint16x4_t vmls_n_u16(uint16x4_t a, 
uint16x4_t b, uint16_t c); // VMLS.I16 d0, d0, d0[0] -_NEON2SSE_INLINE uint16x4_t vmls_n_u16(uint16x4_t a, uint16x4_t b, uint16_t c) // VMLS.I16 d0, d0, d0[0] -{ - uint16x4_t vc; - vc = vdup_n_u16(c); - return vmls_u16(a, b, vc); -} - -uint32x2_t vmls_n_u32(uint32x2_t a, uint32x2_t b, uint32_t c); // VMLS.I32 d0, d0, d0[0] -_NEON2SSE_INLINE uint32x2_t vmls_n_u32(uint32x2_t a, uint32x2_t b, uint32_t c) // VMLS.I32 d0, d0, d0[0] -{ - uint32x2_t vc; - vc = vdup_n_u32(c); - return vmls_u32(a, b, vc); -} - -float32x2_t vmls_n_f32(float32x2_t a, float32x2_t b, float32_t c); // VMLS.F32 d0, d0, d0[0] -_NEON2SSE_INLINE float32x2_t vmls_n_f32(float32x2_t a, float32x2_t b, float32_t c) -{ - float32x2_t res; - res.m64_f32[0] = a.m64_f32[0] - b.m64_f32[0] * c; - res.m64_f32[1] = a.m64_f32[1] - b.m64_f32[1] * c; - return res; -} - -int16x8_t vmlsq_n_s16(int16x8_t a, int16x8_t b, int16_t c); // VMLS.I16 q0, q0, d0[0] -_NEON2SSE_INLINE int16x8_t vmlsq_n_s16(int16x8_t a, int16x8_t b, int16_t c) // VMLS.I16 q0, q0, d0[0] -{ - int16x8_t vc; - vc = vdupq_n_s16(c); - return vmlsq_s16(a, b,vc); -} - -int32x4_t vmlsq_n_s32(int32x4_t a, int32x4_t b, int32_t c); // VMLS.I32 q0, q0, d0[0] -_NEON2SSE_INLINE int32x4_t vmlsq_n_s32(int32x4_t a, int32x4_t b, int32_t c) // VMLS.I32 q0, q0, d0[0] -{ - int32x4_t vc; - vc = vdupq_n_s32(c); - return vmlsq_s32(a,b,vc); -} - -uint16x8_t vmlsq_n_u16(uint16x8_t a, uint16x8_t b, uint16_t c); // VMLS.I16 q0, q0, d0[0] -_NEON2SSE_INLINE uint16x8_t vmlsq_n_u16(uint16x8_t a, uint16x8_t b, uint16_t c) // VMLS.I16 q0, q0, d0[0] -{ - uint16x8_t vc; - vc = vdupq_n_u16(c); - return vmlsq_u16(a,b,vc); -} - -uint32x4_t vmlsq_n_u32(uint32x4_t a, uint32x4_t b, uint32_t c); // VMLS.I32 q0, q0, d0[0] -_NEON2SSE_INLINE uint32x4_t vmlsq_n_u32(uint32x4_t a, uint32x4_t b, uint32_t c) // VMLS.I32 q0, q0, d0[0] -{ - uint32x4_t vc; - vc = vdupq_n_u32(c); - return vmlsq_u32(a,b,vc); -} - -float32x4_t vmlsq_n_f32(float32x4_t a, float32x4_t b, float32_t c); // VMLS.F32 q0, q0, 
d0[0] -_NEON2SSE_INLINE float32x4_t vmlsq_n_f32(float32x4_t a, float32x4_t b, float32_t c) -{ - float32x4_t vc; - vc = vdupq_n_f32(c); - return vmlsq_f32(a,b,vc); -} - -//**** Vector widening multiply subtract with scalar ****** -int32x4_t vmlsl_n_s16(int32x4_t a, int16x4_t b, int16_t c); // VMLSL.S16 q0, d0, d0[0] -_NEON2SSE_INLINE int32x4_t vmlsl_n_s16(int32x4_t a, int16x4_t b, int16_t c) // VMLSL.S16 q0, d0, d0[0] -{ - int16x4_t vc; - vc = vdup_n_s16(c); - return vmlsl_s16(a, b, vc); -} - -int64x2_t vmlsl_n_s32(int64x2_t a, int32x2_t b, int32_t c); // VMLSL.S32 q0, d0, d0[0] -_NEON2SSE_INLINE int64x2_t vmlsl_n_s32(int64x2_t a, int32x2_t b, int32_t c) // VMLSL.S32 q0, d0, d0[0] -{ - int32x2_t vc; - vc = vdup_n_s32(c); - return vmlsl_s32(a, b, vc); -} - -uint32x4_t vmlsl_n_u16(uint32x4_t a, uint16x4_t b, uint16_t c); // VMLSL.U16 q0, d0, d0[0] -_NEON2SSE_INLINE uint32x4_t vmlsl_n_u16(uint32x4_t a, uint16x4_t b, uint16_t c) // VMLSL.U16 q0, d0, d0[0] -{ - uint16x4_t vc; - vc = vdup_n_u16(c); - return vmlsl_u16(a, b, vc); -} - -uint64x2_t vmlsl_n_u32(uint64x2_t a, uint32x2_t b, uint32_t c); // VMLSL.U32 q0, d0, d0[0] -_NEON2SSE_INLINE uint64x2_t vmlsl_n_u32(uint64x2_t a, uint32x2_t b, uint32_t c) // VMLSL.U32 q0, d0, d0[0] -{ - uint32x2_t vc; - vc = vdup_n_u32(c); - return vmlsl_u32(a, b, vc); -} - -//***** Vector widening saturating doubling multiply subtract with scalar ********* -//********************************************************************************** -int32x4_t vqdmlsl_n_s16(int32x4_t a, int16x4_t b, int16_t c); // VQDMLSL.S16 q0, d0, d0[0] -_NEON2SSE_INLINE int32x4_t vqdmlsl_n_s16(int32x4_t a, int16x4_t b, int16_t c) -{ - int16x4_t vc; - vc = vdup_n_s16(c); - return vqdmlsl_s16(a, b, vc); -} - -int64x2_t vqdmlsl_n_s32(int64x2_t a, int32x2_t b, int32_t c); // VQDMLSL.S32 q0, d0, d0[0] -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int64x2_t vqdmlsl_n_s32(int64x2_t a, int32x2_t b, int32_t c), _NEON2SSE_REASON_SLOW_SERIAL) -{ - int32x2_t vc; - vc = 
vdup_n_s32(c); - return vqdmlsl_s32(a, b, vc); -} - -//******************* Vector extract *********************************************** -//************************************************************************************* -//VEXT (Vector Extract) extracts elements from the bottom end of the second operand -//vector and the top end of the first, concatenates them, and places the result in the destination vector: -//c elements from the bottom end of the second operand and (8-c) from the top end of the first -int8x8_t vext_s8(int8x8_t a, int8x8_t b, __constrange(0,7) int c); // VEXT.8 d0,d0,d0,#0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int8x8_t vext_s8(int8x8_t a, int8x8_t b, __constrange(0,7) int c),_NEON2SSE_REASON_SLOW_SERIAL) -{ - int8x8_t res; - int i; - for (i = 0; i<8 - c; i++) { - res.m64_i8[i] = a.m64_i8[i + c]; - } - for(i = 0; i<c; i++) { - res.m64_i8[8 - c + i] = b.m64_i8[i]; - } - return res; -} - -uint8x8_t vext_u8(uint8x8_t a, uint8x8_t b, __constrange(0,7) int c); // VEXT.8 d0,d0,d0,#0 -#define vext_u8 vext_s8 -//same result tested - -poly8x8_t vext_p8(poly8x8_t a, poly8x8_t b, __constrange(0,7) int c); // VEXT.8 d0,d0,d0,#0 -#define vext_p8 vext_u8 - -int16x4_t vext_s16(int16x4_t a, int16x4_t b, __constrange(0,3) int c); // VEXT.16 d0,d0,d0,#0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(int16x4_t vext_s16(int16x4_t a, int16x4_t b, __constrange(0,3) int c), _NEON2SSE_REASON_SLOW_SERIAL) -{ - int16x4_t res; - int i; - for (i = 0; i<4 - c; i++) { - res.m64_i16[i] = a.m64_i16[i + c]; - } - for(i = 0; i<c; i++) { - res.m64_i16[4 - c + i] = b.m64_i16[i]; - } - return res; -} - -uint16x4_t vext_u16(uint16x4_t a, uint16x4_t b, __constrange(0,3) int c); // VEXT.16 d0,d0,d0,#0 -#define vext_u16 vext_s16 - -poly16x4_t vext_p16(poly16x4_t a, poly16x4_t b, __constrange(0,3) int c); // VEXT.16 d0,d0,d0,#0 -#define vext_p16 vext_s16 - -int32x2_t vext_s32(int32x2_t a, int32x2_t b, __constrange(0,1) int c); // VEXT.32 d0,d0,d0,#0 -_NEON2SSE_INLINE 
_NEON2SSE_PERFORMANCE_WARNING(int32x2_t vext_s32(int32x2_t a, int32x2_t b, __constrange(0,1) int c), _NEON2SSE_REASON_SLOW_SERIAL) -{ - int32x2_t res; - if (c==0) { - res.m64_i32[0] = a.m64_i32[0]; - res.m64_i32[1] = a.m64_i32[1]; - } else { - res.m64_i32[0] = a.m64_i32[1]; - res.m64_i32[1] = b.m64_i32[0]; - } - return res; -} - -float32x2_t vext_f32(float32x2_t a, float32x2_t b, __constrange(0,1) int c); // VEXT.32 d0,d0,d0,#0 -_NEON2SSE_INLINE _NEON2SSE_PERFORMANCE_WARNING(float32x2_t vext_f32(float32x2_t a, float32x2_t b, __constrange(0,1) int c), _NEON2SSE_REASON_SLOW_SERIAL) -{ - float32x2_t res; - if (c==0) { - res.m64_f32[0] = a.m64_f32[0]; - res.m64_f32[1] = a.m64_f32[1]; - } else { - res.m64_f32[0] = a.m64_f32[1]; - res.m64_f32[1] = b.m64_f32[0]; - } - return res; -} - -uint32x2_t vext_u32(uint32x2_t a, uint32x2_t b, __constrange(0,1) int c); // VEXT.32 d0,d0,d0,#0 -#define vext_u32 vext_s32 - - -int64x1_t vext_s64(int64x1_t a, int64x1_t b, __constrange(0,0) int c); // VEXT.64 d0,d0,d0,#0 -#define vext_s64(a,b,c) a - -uint64x1_t vext_u64(uint64x1_t a, uint64x1_t b, __constrange(0,0) int c); // VEXT.64 d0,d0,d0,#0 -#define vext_u64(a,b,c) a - -int8x16_t vextq_s8(int8x16_t a, int8x16_t b, __constrange(0,15) int c); // VEXT.8 q0,q0,q0,#0 -#define vextq_s8(a,b,c) _MM_ALIGNR_EPI8 (b,a,c) - -uint8x16_t vextq_u8(uint8x16_t a, uint8x16_t b, __constrange(0,15) int c); // VEXT.8 q0,q0,q0,#0 -#define vextq_u8(a,b,c) _MM_ALIGNR_EPI8 (b,a,c) - -poly8x16_t vextq_p8(poly8x16_t a, poly8x16_t b, __constrange(0,15) int c); // VEXT.8 q0,q0,q0,#0 -#define vextq_p8 vextq_s8 - -int16x8_t vextq_s16(int16x8_t a, int16x8_t b, __constrange(0,7) int c); // VEXT.16 q0,q0,q0,#0 -#define vextq_s16(a,b,c) _MM_ALIGNR_EPI8 (b,a,c * 2) - -uint16x8_t vextq_u16(uint16x8_t a, uint16x8_t b, __constrange(0,7) int c); // VEXT.16 q0,q0,q0,#0 -#define vextq_u16(a,b,c) _MM_ALIGNR_EPI8 (b,a,c * 2) - -poly16x8_t vextq_p16(poly16x8_t a, poly16x8_t b, __constrange(0,7) int c); // VEXT.16 q0,q0,q0,#0 
-#define vextq_p16 vextq_s16 - -int32x4_t vextq_s32(int32x4_t a, int32x4_t b, __constrange(0,3) int c); // VEXT.32 q0,q0,q0,#0 -#define vextq_s32(a,b,c) _MM_ALIGNR_EPI8 (b,a,c * 4) - -uint32x4_t vextq_u32(uint32x4_t a, uint32x4_t b, __constrange(0,3) int c); // VEXT.32 q0,q0,q0,#0 -#define vextq_u32(a,b,c) _MM_ALIGNR_EPI8 (b,a,c * 4) - -float32x4_t vextq_f32(float32x4_t a, float32x4_t b, __constrange(0,3) int c); // VEXT.32 q0,q0,q0,#0 -#define vextq_f32(a,b,c) _M128(vextq_s32(_M128i(a),_M128i(b),c) ) - -int64x2_t vextq_s64(int64x2_t a, int64x2_t b, __constrange(0,1) int c); // VEXT.64 q0,q0,q0,#0 -#define vextq_s64(a,b,c) _MM_ALIGNR_EPI8(b,a,c * 8) - -uint64x2_t vextq_u64(uint64x2_t a, uint64x2_t b, __constrange(0,1) int c); // VEXT.64 q0,q0,q0,#0 -#define vextq_u64(a,b,c) _MM_ALIGNR_EPI8(b,a,c * 8) - -//************ Reverse vector elements (swap endianness)***************** -//************************************************************************* -//VREVn.m reverses the order of the m-bit lanes within a set that is n bits wide.
-int8x8_t vrev64_s8(int8x8_t vec); // VREV64.8 d0,d0 -_NEON2SSE_INLINE int8x8_t vrev64_s8(int8x8_t vec) -{ - int8x8_t res64; - __m128i res; - res = vrev64q_s8(_pM128i(vec)); - return64(res); -} - -int16x4_t vrev64_s16(int16x4_t vec); // VREV64.16 d0,d0 -_NEON2SSE_INLINE int16x4_t vrev64_s16(int16x4_t vec) -{ - int16x4_t res64; - __m128i res; - res = vrev64q_s16(_pM128i(vec)); - return64(res); -} - -int32x2_t vrev64_s32(int32x2_t vec); // VREV64.32 d0,d0 -_NEON2SSE_INLINE int32x2_t vrev64_s32(int32x2_t vec) -{ - int32x2_t res; - res.m64_i32[0] = vec.m64_i32[1]; - res.m64_i32[1] = vec.m64_i32[0]; - return res; -} - -uint8x8_t vrev64_u8(uint8x8_t vec); // VREV64.8 d0,d0 -#define vrev64_u8 vrev64_s8 - -uint16x4_t vrev64_u16(uint16x4_t vec); // VREV64.16 d0,d0 -#define vrev64_u16 vrev64_s16 - -uint32x2_t vrev64_u32(uint32x2_t vec); // VREV64.32 d0,d0 -#define vrev64_u32 vrev64_s32 - -poly8x8_t vrev64_p8(poly8x8_t vec); // VREV64.8 d0,d0 -#define vrev64_p8 vrev64_u8 - -poly16x4_t vrev64_p16(poly16x4_t vec); // VREV64.16 d0,d0 -#define vrev64_p16 vrev64_u16 - -float32x2_t vrev64_f32(float32x2_t vec); // VREV64.32 d0,d0 -_NEON2SSE_INLINE float32x2_t vrev64_f32(float32x2_t vec) -{ - float32x2_t res; - res.m64_f32[0] = vec.m64_f32[1]; - res.m64_f32[1] = vec.m64_f32[0]; - return res; -} - -int8x16_t vrev64q_s8(int8x16_t vec); // VREV64.8 q0,q0 -_NEON2SSE_INLINE int8x16_t vrev64q_s8(int8x16_t vec) // VREV64.8 q0,q0 -{ - _NEON2SSE_ALIGN_16 int8_t mask_rev_e8[16] = {7,6,5,4,3,2,1,0, 15,14,13,12,11,10,9, 8}; - return _mm_shuffle_epi8 (vec, *(__m128i*) mask_rev_e8); -} - -int16x8_t vrev64q_s16(int16x8_t vec); // VREV64.16 q0,q0 -_NEON2SSE_INLINE int16x8_t vrev64q_s16(int16x8_t vec) // VREV64.16 q0,q0 -{ - //no _mm_shuffle_epi16, _mm_shuffle_epi8 to be used with the corresponding mask - _NEON2SSE_ALIGN_16 int8_t mask_rev_e16[16] = {6,7, 4,5,2,3,0,1,14,15,12,13,10,11,8,9}; - return _mm_shuffle_epi8 (vec, *(__m128i*)mask_rev_e16); -} - -int32x4_t vrev64q_s32(int32x4_t vec); // 
VREV64.32 q0,q0 -_NEON2SSE_INLINE int32x4_t vrev64q_s32(int32x4_t vec) // VREV64.32 q0,q0 -{ - return _mm_shuffle_epi32 (vec, 1 | (0 << 2) | (3 << 4) | (2 << 6) ); -} - -uint8x16_t vrev64q_u8(uint8x16_t vec); // VREV64.8 q0,q0 -#define vrev64q_u8 vrev64q_s8 - -uint16x8_t vrev64q_u16(uint16x8_t vec); // VREV64.16 q0,q0 -#define vrev64q_u16 vrev64q_s16 - -uint32x4_t vrev64q_u32(uint32x4_t vec); // VREV64.32 q0,q0 -#define vrev64q_u32 vrev64q_s32 - -poly8x16_t vrev64q_p8(poly8x16_t vec); // VREV64.8 q0,q0 -#define vrev64q_p8 vrev64q_u8 - -poly16x8_t vrev64q_p16(poly16x8_t vec); // VREV64.16 q0,q0 -#define vrev64q_p16 vrev64q_u16 - -float32x4_t vrev64q_f32(float32x4_t vec); // VREV64.32 q0,q0 -#define vrev64q_f32(vec) _mm_shuffle_ps (vec, vec, _MM_SHUFFLE(2,3, 0,1)) - -//******************** 32 bit shuffles ********************** -//************************************************************ -int8x8_t vrev32_s8(int8x8_t vec); // VREV32.8 d0,d0 -_NEON2SSE_INLINE int8x8_t vrev32_s8(int8x8_t vec) -{ - int8x8_t res64; - __m128i res; - res = vrev32q_s8(_pM128i(vec)); - return64(res); -} - -int16x4_t vrev32_s16(int16x4_t vec); // VREV32.16 d0,d0 -_NEON2SSE_INLINE int16x4_t vrev32_s16(int16x4_t vec) -{ - int16x4_t res64; - __m128i res; - res = vrev32q_s16(_pM128i(vec)); - return64(res); -} - -uint8x8_t vrev32_u8(uint8x8_t vec); // VREV32.8 d0,d0 -#define vrev32_u8 vrev32_s8 - -uint16x4_t vrev32_u16(uint16x4_t vec); // VREV32.16 d0,d0 -#define vrev32_u16 vrev32_s16 - -poly8x8_t vrev32_p8(poly8x8_t vec); // VREV32.8 d0,d0 -#define vrev32_p8 vrev32_u8 - -poly16x4_t vrev32_p16(poly16x4_t vec); // VREV32.16 d0,d0 -#define vrev32_p16 vrev32_u16 - -int8x16_t vrev32q_s8(int8x16_t vec); // VREV32.8 q0,q0 -_NEON2SSE_INLINE int8x16_t vrev32q_s8(int8x16_t vec) // VREV32.8 q0,q0 -{ - _NEON2SSE_ALIGN_16 int8_t mask_rev_e8[16] = {3,2,1,0, 7,6,5,4, 11,10,9,8, 15,14,13,12}; - return _mm_shuffle_epi8 (vec, *(__m128i*) mask_rev_e8); -} - -int16x8_t vrev32q_s16(int16x8_t vec); // VREV32.16 
q0,q0 -_NEON2SSE_INLINE int16x8_t vrev32q_s16(int16x8_t vec) // VREV32.16 q0,q0 -{ - _NEON2SSE_ALIGN_16 int8_t mask_rev_e8[16] = {2,3,0,1, 6,7, 4,5, 10,11, 8,9, 14,15,12,13}; - return _mm_shuffle_epi8 (vec, *(__m128i*) mask_rev_e8); -} - -uint8x16_t vrev32q_u8(uint8x16_t vec); // VREV32.8 q0,q0 -#define vrev32q_u8 vrev32q_s8 - -uint16x8_t vrev32q_u16(uint16x8_t vec); // VREV32.16 q0,q0 -#define vrev32q_u16 vrev32q_s16 - -poly8x16_t vrev32q_p8(poly8x16_t vec); // VREV32.8 q0,q0 -#define vrev32q_p8 vrev32q_u8 - -poly16x8_t vrev32q_p16(poly16x8_t vec); // VREV32.16 q0,q0 -#define vrev32q_p16 vrev32q_u16 - -//************* 16 bit shuffles ********************** -//****************************************************** -int8x8_t vrev16_s8(int8x8_t vec); // VREV16.8 d0,d0 -_NEON2SSE_INLINE int8x8_t vrev16_s8(int8x8_t vec) -{ - int8x8_t res64; - __m128i res; - res = vrev16q_s8(_pM128i(vec)); - return64(res); -} - -uint8x8_t vrev16_u8(uint8x8_t vec); // VREV16.8 d0,d0 -#define vrev16_u8 vrev16_s8 - -poly8x8_t vrev16_p8(poly8x8_t vec); // VREV16.8 d0,d0 -#define vrev16_p8 vrev16_u8 - -int8x16_t vrev16q_s8(int8x16_t vec); // VREV16.8 q0,q0 -_NEON2SSE_INLINE int8x16_t vrev16q_s8(int8x16_t vec) // VREV16.8 q0,q0 -{ - _NEON2SSE_ALIGN_16 int8_t mask_rev8[16] = {1,0, 3,2, 5,4, 7,6, 9,8, 11, 10, 13, 12, 15, 14}; - return _mm_shuffle_epi8 (vec, *(__m128i*) mask_rev8); -} - -uint8x16_t vrev16q_u8(uint8x16_t vec); // VREV16.8 q0,q0 -#define vrev16q_u8 vrev16q_s8 - -poly8x16_t vrev16q_p8(poly8x16_t vec); // VREV16.8 q0,q0 -#define vrev16q_p8 vrev16q_u8 - -//********************************************************************* -//**************** Other single operand arithmetic ******************* -//********************************************************************* - -//*********** Absolute: Vd[i] = |Va[i]| ********************************** -//************************************************************************ -int8x8_t vabs_s8(int8x8_t a); // VABS.S8 d0,d0 -_NEON2SSE_INLINE 
int8x8_t vabs_s8(int8x8_t a) -{ - int8x8_t res64; - __m128i res; - res = _mm_abs_epi8(_pM128i(a)); - return64(res); -} - - -int16x4_t vabs_s16(int16x4_t a); // VABS.S16 d0,d0 -_NEON2SSE_INLINE int16x4_t vabs_s16(int16x4_t a) -{ - int16x4_t res64; - __m128i res; - res = _mm_abs_epi16(_pM128i(a)); - return64(res); -} - -int32x2_t vabs_s32(int32x2_t a); // VABS.S32 d0,d0 -_NEON2SSE_INLINE int32x2_t vabs_s32(int32x2_t a) -{ - int32x2_t res64; - __m128i res; - res = _mm_abs_epi32(_pM128i(a)); - return64(res); -} - -float32x2_t vabs_f32(float32x2_t a); // VABS.F32 d0,d0 -_NEON2SSE_INLINE float32x2_t vabs_f32(float32x2_t a) // VABS.F32 d0,d0 -{ - float32x4_t res; - __m64_128 res64; - _NEON2SSE_ALIGN_16 int32_t c7fffffff[4] = {0x7fffffff, 0x7fffffff, 0x7fffffff, 0x7fffffff}; - res = _mm_and_ps (_pM128(a), *(__m128*)c7fffffff); //use 64 low bits only - _M64f(res64, res); - return res64; -} - -int8x16_t vabsq_s8(int8x16_t a); // VABS.S8 q0,q0 -#define vabsq_s8 _mm_abs_epi8 - -int16x8_t vabsq_s16(int16x8_t a); // VABS.S16 q0,q0 -#define vabsq_s16 _mm_abs_epi16 - -int32x4_t vabsq_s32(int32x4_t a); // VABS.S32 q0,q0 -#define vabsq_s32 _mm_abs_epi32 - -float32x4_t vabsq_f32(float32x4_t a); // VABS.F32 q0,q0 -_NEON2SSE_INLINE float32x4_t vabsq_f32(float32x4_t a) // VABS.F32 q0,q0 -{ - _NEON2SSE_ALIGN_16 int32_t c7fffffff[4] = {0x7fffffff, 0x7fffffff, 0x7fffffff, 0x7fffffff}; - return _mm_and_ps (a, *(__m128*)c7fffffff); -} - -//****** Saturating absolute: Vd[i] = sat(|Va[i]|) ********************* -//********************************************************************** -//For signed-integer data types, the absolute value of the most negative value is not representable by the data type, so saturation takes place -int8x8_t vqabs_s8(int8x8_t a); // VQABS.S8 d0,d0 -_NEON2SSE_INLINE int8x8_t vqabs_s8(int8x8_t a) -{ - int8x8_t res64; - __m128i res; - res = vqabsq_s8(_pM128i(a)); - return64(res); -} - -int16x4_t vqabs_s16(int16x4_t a); // VQABS.S16 d0,d0 -_NEON2SSE_INLINE int16x4_t 
vqabs_s16(int16x4_t a) -{ - int16x4_t res64; - __m128i res; - res = vqabsq_s16(_pM128i(a)); - return64(res); -} - -int32x2_t vqabs_s32(int32x2_t a); // VQABS.S32 d0,d0 -_NEON2SSE_INLINE int32x2_t vqabs_s32(int32x2_t a) -{ - int32x2_t res64; - __m128i res; - res = vqabsq_s32(_pM128i(a)); - return64(res); -} - -int8x16_t vqabsq_s8(int8x16_t a); // VQABS.S8 q0,q0 -_NEON2SSE_INLINE int8x16_t vqabsq_s8(int8x16_t a) // VQABS.S8 q0,q0 -{ - __m128i c_128, abs, abs_cmp; - c_128 = _mm_set1_epi8 (0x80); //-128 - abs = _mm_abs_epi8 (a); - abs_cmp = _mm_cmpeq_epi8 (abs, c_128); - return _mm_xor_si128 (abs, abs_cmp); -} - -int16x8_t vqabsq_s16(int16x8_t a); // VQABS.S16 q0,q0 -_NEON2SSE_INLINE int16x8_t vqabsq_s16(int16x8_t a) // VQABS.S16 q0,q0 -{ - __m128i c_32768, abs, abs_cmp; - c_32768 = _mm_set1_epi16 (0x8000); //-32768 - abs = _mm_abs_epi16 (a); - abs_cmp = _mm_cmpeq_epi16 (abs, c_32768); - return _mm_xor_si128 (abs, abs_cmp); -} - -int32x4_t vqabsq_s32(int32x4_t a); // VQABS.S32 q0,q0 -_NEON2SSE_INLINE int32x4_t vqabsq_s32(int32x4_t a) // VQABS.S32 q0,q0 -{ - __m128i c80000000, abs, abs_cmp; - c80000000 = _mm_set1_epi32 (0x80000000); //most negative value - abs = _mm_abs_epi32 (a); - abs_cmp = _mm_cmpeq_epi32 (abs, c80000000); - return _mm_xor_si128 (abs, abs_cmp); -} - -//*************** Negate: Vd[i] = - Va[i] ************************************* -//***************************************************************************** -//several Negate implementations possible for SIMD. 
-//e.g. the _mm_sign function (a, vector of negative numbers), but the following one gives good performance: -int8x8_t vneg_s8(int8x8_t a); // VNEG.S8 d0,d0 -_NEON2SSE_INLINE int8x8_t vneg_s8(int8x8_t a) -{ - int8x8_t res64; - __m128i res; - res = vnegq_s8(_pM128i(a)); - return64(res); -} - -int16x4_t vneg_s16(int16x4_t a); // VNEG.S16 d0,d0 -_NEON2SSE_INLINE int16x4_t vneg_s16(int16x4_t a) -{ - int16x4_t res64; - __m128i res; - res = vnegq_s16(_pM128i(a)); - return64(res); -} - -int32x2_t vneg_s32(int32x2_t a); // VNEG.S32 d0,d0 -_NEON2SSE_INLINE int32x2_t vneg_s32(int32x2_t a) -{ - int32x2_t res64; - __m128i res; - res = vnegq_s32(_pM128i(a)); - return64(res); -} - -float32x2_t vneg_f32(float32x2_t a); // VNEG.F32 d0,d0 -_NEON2SSE_INLINE float32x2_t vneg_f32(float32x2_t a) // VNEG.F32 d0,d0 -{ - float32x4_t res; - __m64_128 res64; - _NEON2SSE_ALIGN_16 int32_t c80000000[4] = {0x80000000, 0x80000000, 0x80000000, 0x80000000}; - res = _mm_xor_ps (_pM128(a), *(__m128*) c80000000); //use low 64 bits - _M64f(res64, res); - return res64; -} - -int8x16_t vnegq_s8(int8x16_t a); // VNEG.S8 q0,q0 -_NEON2SSE_INLINE int8x16_t vnegq_s8(int8x16_t a) // VNEG.S8 q0,q0 -{ - __m128i zero; - zero = _mm_setzero_si128 (); - return _mm_sub_epi8 (zero, a); -} //or _mm_sign_epi8 (a, negative numbers vector) - -int16x8_t vnegq_s16(int16x8_t a); // VNEG.S16 q0,q0 -_NEON2SSE_INLINE int16x8_t vnegq_s16(int16x8_t a) // VNEG.S16 q0,q0 -{ - __m128i zero; - zero = _mm_setzero_si128 (); - return _mm_sub_epi16 (zero, a); -} //or _mm_sign_epi16 (a, negative numbers vector) - -int32x4_t vnegq_s32(int32x4_t a); // VNEG.S32 q0,q0 -_NEON2SSE_INLINE int32x4_t vnegq_s32(int32x4_t a) // VNEG.S32 q0,q0 -{ - __m128i zero; - zero = _mm_setzero_si128 (); - return _mm_sub_epi32 (zero, a); -} //or _mm_sign_epi32 (a, negative numbers vector) - -float32x4_t vnegq_f32(float32x4_t a); // VNEG.F32 q0,q0 -_NEON2SSE_INLINE float32x4_t vnegq_f32(float32x4_t a) // VNEG.F32 q0,q0 -{ - _NEON2SSE_ALIGN_16 int32_t c80000000[4] = {0x80000000, 0x80000000, 0x80000000, 
0x80000000}; - return _mm_xor_ps (a, *(__m128*) c80000000); -} - -//************** Saturating Negate: sat(Vd[i] = - Va[i]) ************************** -//*************************************************************************************** -//For signed-integer data types, the negation of the most negative value cannot be produced without saturation; with saturation it becomes the maximum positive value -int8x8_t vqneg_s8(int8x8_t a); // VQNEG.S8 d0,d0 -_NEON2SSE_INLINE int8x8_t vqneg_s8(int8x8_t a) -{ - int8x8_t res64; - __m128i res; - res = vqnegq_s8(_pM128i(a)); - return64(res); -} - -int16x4_t vqneg_s16(int16x4_t a); // VQNEG.S16 d0,d0 -_NEON2SSE_INLINE int16x4_t vqneg_s16(int16x4_t a) -{ - int16x4_t res64; - __m128i res; - res = vqnegq_s16(_pM128i(a)); - return64(res); -} - -int32x2_t vqneg_s32(int32x2_t a); // VQNEG.S32 d0,d0 -_NEON2SSE_INLINE int32x2_t vqneg_s32(int32x2_t a) -{ - int32x2_t res64; - __m128i res; - res = vqnegq_s32(_pM128i(a)); - return64(res); -} - -int8x16_t vqnegq_s8(int8x16_t a); // VQNEG.S8 q0,q0 -_NEON2SSE_INLINE int8x16_t vqnegq_s8(int8x16_t a) // VQNEG.S8 q0,q0 -{ - __m128i zero; - zero = _mm_setzero_si128 (); - return _mm_subs_epi8 (zero, a); //saturating subtraction -} - -int16x8_t vqnegq_s16(int16x8_t a); // VQNEG.S16 q0,q0 -_NEON2SSE_INLINE int16x8_t vqnegq_s16(int16x8_t a) // VQNEG.S16 q0,q0 -{ - __m128i zero; - zero = _mm_setzero_si128 (); - return _mm_subs_epi16 (zero, a); //saturating subtraction -} - -int32x4_t vqnegq_s32(int32x4_t a); // VQNEG.S32 q0,q0 -_NEON2SSE_INLINE int32x4_t vqnegq_s32(int32x4_t a) // VQNEG.S32 q0,q0 -{ - //solution may not be optimal compared with a serial one - __m128i c80000000, zero, sub, cmp; - c80000000 = _mm_set1_epi32 (0x80000000); //most negative value - zero = _mm_setzero_si128 (); - sub = _mm_sub_epi32 (zero, a); //subtraction - cmp = _mm_cmpeq_epi32 (a, c80000000); - return _mm_xor_si128 (sub, cmp); -} - -//****************** Count leading zeros ******************************** 
-//************************************************************************** -//no corresponding vector intrinsics in IA32, need to implement it. While the implementation is effective for 8 bits, it may not be for 16 and 32 bits -int8x8_t vclz_s8(int8x8_t a); // VCLZ.I8 d0,d0 -_NEON2SSE_INLINE int8x8_t vclz_s8(int8x8_t a) -{ - int8x8_t res64; - __m128i res; - res = vclzq_s8(_pM128i(a)); - return64(res); -} - -int16x4_t vclz_s16(int16x4_t a); // VCLZ.I16 d0,d0 -_NEON2SSE_INLINE int16x4_t vclz_s16(int16x4_t a) -{ - int16x4_t res64; - __m128i res; - res = vclzq_s16(_pM128i(a)); - return64(res); -} - -int32x2_t vclz_s32(int32x2_t a); // VCLZ.I32 d0,d0 -_NEON2SSE_INLINE int32x2_t vclz_s32(int32x2_t a) -{ - int32x2_t res64; - __m128i res; - res = vclzq_s32(_pM128i(a)); - return64(res); -} - - -uint8x8_t vclz_u8(uint8x8_t a); // VCLZ.I8 d0,d0 -#define vclz_u8 vclz_s8 - -uint16x4_t vclz_u16(uint16x4_t a); // VCLZ.I16 d0,d0 -#define vclz_u16 vclz_s16 - -uint32x2_t vclz_u32(uint32x2_t a); // VCLZ.I32 d0,d0 -#define vclz_u32 vclz_s32 - -int8x16_t vclzq_s8(int8x16_t a); // VCLZ.I8 q0,q0 -_NEON2SSE_INLINE int8x16_t vclzq_s8(int8x16_t a) -{ - _NEON2SSE_ALIGN_16 int8_t mask_CLZ[16] = { /* 0 */ 4,/* 1 */ 3,/* 2 */ 2,/* 3 */ 2, - /* 4 */ 1,/* 5 */ 1,/* 6 */ 1,/* 7 */ 1, - /* 8 */ 0,/* 9 */ 0,/* a */ 0,/* b */ 0, - /* c */ 0,/* d */ 0,/* e */ 0,/* f */ 0 }; - __m128i maskLOW, c4, lowclz, mask, hiclz; - maskLOW = _mm_set1_epi8(0x0f); //low 4 bits, don't need masking low to avoid zero if MSB is set - it happens automatically - c4 = _mm_set1_epi8(4); - lowclz = _mm_shuffle_epi8( *(__m128i*)mask_CLZ, a); //uses low 4 bits anyway - mask = _mm_srli_epi16(a, 4); //get high 4 bits as low bits - mask = _mm_and_si128(mask, maskLOW); //low 4 bits, need masking to avoid zero if MSB is set - hiclz = _mm_shuffle_epi8( *(__m128i*) mask_CLZ, mask); //uses low 4 bits anyway - mask = _mm_cmpeq_epi8(hiclz, c4); // shows the need to add lowclz zeros - lowclz = _mm_and_si128(lowclz,mask); - return
_mm_add_epi8(lowclz, hiclz); -} - -int16x8_t vclzq_s16(int16x8_t a); // VCLZ.I16 q0,q0 -_NEON2SSE_INLINE int16x8_t vclzq_s16(int16x8_t a) -{ - __m128i c7, res8x16, res8x16_swap; - _NEON2SSE_ALIGN_16 int8_t mask8_sab[16] = { 1, 0, 3, 2, 5, 4, 7, 6, 9, 8, 11, 10, 13, 12, 15, 14}; - _NEON2SSE_ALIGN_16 uint16_t mask8bit[8] = {0x00ff, 0x00ff, 0x00ff, 0x00ff,0x00ff, 0x00ff, 0x00ff, 0x00ff}; - c7 = _mm_srli_epi16(*(__m128i*)mask8bit, 5); //7 - res8x16 = vclzq_s8(a); - res8x16_swap = _mm_shuffle_epi8 (res8x16, *(__m128i*) mask8_sab); //horizontal pairs swap - res8x16 = _mm_and_si128(res8x16, *(__m128i*)mask8bit); //lowclz - res8x16_swap = _mm_and_si128(res8x16_swap, *(__m128i*)mask8bit); //hiclz - c7 = _mm_cmpgt_epi16(res8x16_swap, c7); // shows the need to add lowclz zeros - res8x16 = _mm_and_si128(res8x16, c7); //lowclz - return _mm_add_epi16(res8x16_swap, res8x16); -} - -int32x4_t vclzq_s32(int32x4_t a); // VCLZ.I32 q0,q0 -_NEON2SSE_INLINE int32x4_t vclzq_s32(int32x4_t a) -{ - __m128i c55555555, c33333333, c0f0f0f0f, c3f, c32, tmp, tmp1, res; - c55555555 = _mm_set1_epi32(0x55555555); - c33333333 = _mm_set1_epi32(0x33333333); - c0f0f0f0f = _mm_set1_epi32(0x0f0f0f0f); - c3f = _mm_set1_epi32(0x3f); - c32 = _mm_set1_epi32(32); - tmp = _mm_srli_epi32(a, 1); - res = _mm_or_si128(tmp, a); //atmp[i] |= (atmp[i] >> 1); - tmp = _mm_srli_epi32(res, 2); - res = _mm_or_si128(tmp, res); //atmp[i] |= (atmp[i] >> 2); - tmp = _mm_srli_epi32(res, 4); - res = _mm_or_si128(tmp, res); //atmp[i] |= (atmp[i] >> 4); - tmp = _mm_srli_epi32(res, 8); - res = _mm_or_si128(tmp, res); //atmp[i] |= (atmp[i] >> 8); - tmp = _mm_srli_epi32(res, 16); - res = _mm_or_si128(tmp, res); //atmp[i] |= (atmp[i] >> 16); - - tmp = _mm_srli_epi32(res, 1); - tmp = _mm_and_si128(tmp, c55555555); - res = _mm_sub_epi32(res, tmp); //atmp[i] -= ((atmp[i] >> 1) & 0x55555555); - - tmp = _mm_srli_epi32(res, 2); - tmp = _mm_and_si128(tmp, c33333333); - tmp1 = _mm_and_si128(res, c33333333); - res = _mm_add_epi32(tmp, tmp1);
//atmp[i] = (((atmp[i] >> 2) & 0x33333333) + (atmp[i] & 0x33333333)); - - tmp = _mm_srli_epi32(res, 4); - tmp = _mm_add_epi32(tmp, res); - res = _mm_and_si128(tmp, c0f0f0f0f); //atmp[i] = (((atmp[i] >> 4) + atmp[i]) & 0x0f0f0f0f); - - tmp = _mm_srli_epi32(res, 8); - res = _mm_add_epi32(tmp, res); //atmp[i] += (atmp[i] >> 8); - - tmp = _mm_srli_epi32(res, 16); - res = _mm_add_epi32(tmp, res); //atmp[i] += (atmp[i] >> 16); - - res = _mm_and_si128(res, c3f); //atmp[i] = atmp[i] & 0x0000003f; - - return _mm_sub_epi32(c32, res); //res[i] = 32 - atmp[i]; -} - -uint8x16_t vclzq_u8(uint8x16_t a); // VCLZ.I8 q0,q0 -#define vclzq_u8 vclzq_s8 - -uint16x8_t vclzq_u16(uint16x8_t a); // VCLZ.I16 q0,q0 -#define vclzq_u16 vclzq_s16 - -uint32x4_t vclzq_u32(uint32x4_t a); // VCLZ.I32 q0,q0 -#define vclzq_u32 vclzq_s32 - -//************** Count leading sign bits ************************** -//******************************************************************** -//VCLS (Vector Count Leading Sign bits) counts the number of consecutive bits following -// the topmost bit, that are the same as the topmost bit, in each element in a vector -//No corresponding vector intrinsics in IA32, need to implement it. 
-//While the implementation is effective for 8 bits, it may not be for 16 and 32 bits -int8x8_t vcls_s8(int8x8_t a); // VCLS.S8 d0,d0 -_NEON2SSE_INLINE int8x8_t vcls_s8(int8x8_t a) -{ - int8x8_t res64; - __m128i res; - res = vclsq_s8(_pM128i(a)); - return64(res); -} - -int16x4_t vcls_s16(int16x4_t a); // VCLS.S16 d0,d0 -_NEON2SSE_INLINE int16x4_t vcls_s16(int16x4_t a) -{ - int16x4_t res64; - __m128i res; - res = vclsq_s16(_pM128i(a)); - return64(res); -} - -int32x2_t vcls_s32(int32x2_t a); // VCLS.S32 d0,d0 -_NEON2SSE_INLINE int32x2_t vcls_s32(int32x2_t a) -{ - int32x2_t res64; - __m128i res; - res = vclsq_s32(_pM128i(a)); - return64(res); -} - -int8x16_t vclsq_s8(int8x16_t a); // VCLS.S8 q0,q0 -_NEON2SSE_INLINE int8x16_t vclsq_s8(int8x16_t a) -{ - __m128i cff, c80, c1, a_mask, a_neg, a_pos, a_comb; - cff = _mm_cmpeq_epi8 (a,a); //0xff - c80 = _mm_set1_epi8(0x80); - c1 = _mm_set1_epi8(1); - a_mask = _mm_and_si128(a, c80); - a_mask = _mm_cmpeq_epi8(a_mask, c80); //0xff if negative input and 0 if positive - a_neg = _mm_xor_si128(a, cff); - a_neg = _mm_and_si128(a_mask, a_neg); - a_pos = _mm_andnot_si128(a_mask, a); - a_comb = _mm_or_si128(a_pos, a_neg); - a_comb = vclzq_s8(a_comb); - return _mm_sub_epi8(a_comb, c1); -} - -int16x8_t vclsq_s16(int16x8_t a); // VCLS.S16 q0,q0 -_NEON2SSE_INLINE int16x8_t vclsq_s16(int16x8_t a) -{ - __m128i cffff, c8000, c1, a_mask, a_neg, a_pos, a_comb; - cffff = _mm_cmpeq_epi16(a,a); - c8000 = _mm_slli_epi16(cffff, 15); //0x8000 - c1 = _mm_srli_epi16(cffff,15); //0x1 - a_mask = _mm_and_si128(a, c8000); - a_mask = _mm_cmpeq_epi16(a_mask, c8000); //0xffff if negative input and 0 if positive - a_neg = _mm_xor_si128(a, cffff); - a_neg = _mm_and_si128(a_mask, a_neg); - a_pos = _mm_andnot_si128(a_mask, a); - a_comb = _mm_or_si128(a_pos, a_neg); - a_comb = vclzq_s16(a_comb); - return _mm_sub_epi16(a_comb, c1); -} - -int32x4_t vclsq_s32(int32x4_t a); // VCLS.S32 q0,q0 -_NEON2SSE_INLINE int32x4_t vclsq_s32(int32x4_t a) -{ - __m128i cffffffff,
c80000000, c1, a_mask, a_neg, a_pos, a_comb; - cffffffff = _mm_cmpeq_epi32(a,a); - c80000000 = _mm_slli_epi32(cffffffff, 31); //0x80000000 - c1 = _mm_srli_epi32(cffffffff,31); //0x1 - a_mask = _mm_and_si128(a, c80000000); - a_mask = _mm_cmpeq_epi32(a_mask, c80000000); //0xffffffff if negative input and 0 if positive - a_neg = _mm_xor_si128(a, cffffffff); - a_neg = _mm_and_si128(a_mask, a_neg); - a_pos = _mm_andnot_si128(a_mask, a); - a_comb = _mm_or_si128(a_pos, a_neg); - a_comb = vclzq_s32(a_comb); - return _mm_sub_epi32(a_comb, c1); -} - -//************************* Count number of set bits ******************************** -//************************************************************************************* -//No corresponding SIMD solution. One option is to get the elements, convert them to 32 bits and then use the SSE4.2 _mm_popcnt_u32 (unsigned int v) intrinsic for each element -//another option is to do the following algorithm: - -uint8x8_t vcnt_u8(uint8x8_t a); // VCNT.8 d0,d0 -_NEON2SSE_INLINE uint8x8_t vcnt_u8(uint8x8_t a) -{ - uint8x8_t res64; - __m128i res; - res = vcntq_u8(_pM128i(a)); - return64(res); -} - -int8x8_t vcnt_s8(int8x8_t a); // VCNT.8 d0,d0 -#define vcnt_s8 vcnt_u8 - -poly8x8_t vcnt_p8(poly8x8_t a); // VCNT.8 d0,d0 -#define vcnt_p8 vcnt_u8 - -uint8x16_t vcntq_u8(uint8x16_t a); // VCNT.8 q0,q0 -_NEON2SSE_INLINE uint8x16_t vcntq_u8(uint8x16_t a) -{ - _NEON2SSE_ALIGN_16 int8_t mask_POPCOUNT[16] = { /* 0 */ 0,/* 1 */ 1,/* 2 */ 1,/* 3 */ 2, - /* 4 */ 1,/* 5 */ 2,/* 6 */ 2,/* 7 */ 3, - /* 8 */ 1,/* 9 */ 2,/* a */ 2,/* b */ 3, - /* c */ 2,/* d */ 3,/* e */ 3,/* f */ 4 }; - __m128i maskLOW, mask, lowpopcnt, hipopcnt; - maskLOW = _mm_set1_epi8(0x0f); //low 4 bits, need masking to avoid zero if MSB is set - mask = _mm_and_si128(a, maskLOW); - lowpopcnt = _mm_shuffle_epi8( *(__m128i*)mask_POPCOUNT, mask); //uses low 4 bits anyway - mask = _mm_srli_epi16(a, 4); //get high 4 bits as low bits - mask = _mm_and_si128(mask, maskLOW); //low 4 bits, need masking to avoid
zero if MSB is set - hipopcnt = _mm_shuffle_epi8( *(__m128i*) mask_POPCOUNT, mask); //uses low 4 bits anyway - return _mm_add_epi8(lowpopcnt, hipopcnt); -} - -int8x16_t vcntq_s8(int8x16_t a); // VCNT.8 q0,q0 -#define vcntq_s8 vcntq_u8 - -poly8x16_t vcntq_p8(poly8x16_t a); // VCNT.8 q0,q0 -#define vcntq_p8 vcntq_u8 - -//************************************************************************************** -//*********************** Logical operations **************************************** -//************************************************************************************** -//************************** Bitwise not *********************************** -//several Bitwise not implementations possible for SIMD. Eg "xor" with all ones, but the following one gives good performance -int8x8_t vmvn_s8(int8x8_t a); // VMVN d0,d0 -_NEON2SSE_INLINE int8x8_t vmvn_s8(int8x8_t a) -{ - int8x8_t res64; - __m128i res; - res = vmvnq_s8(_pM128i(a)); - return64(res); -} - -int16x4_t vmvn_s16(int16x4_t a); // VMVN d0,d0 -_NEON2SSE_INLINE int16x4_t vmvn_s16(int16x4_t a) -{ - int16x4_t res64; - __m128i res; - res = vmvnq_s16(_pM128i(a)); - return64(res); -} - -int32x2_t vmvn_s32(int32x2_t a); // VMVN d0,d0 -_NEON2SSE_INLINE int32x2_t vmvn_s32(int32x2_t a) -{ - int32x2_t res64; - __m128i res; - res = vmvnq_s32(_pM128i(a)); - return64(res); -} - -uint8x8_t vmvn_u8(uint8x8_t a); // VMVN d0,d0 -#define vmvn_u8 vmvn_s8 - -uint16x4_t vmvn_u16(uint16x4_t a); // VMVN d0,d0 -#define vmvn_u16 vmvn_s16 - -uint32x2_t vmvn_u32(uint32x2_t a); // VMVN d0,d0 -#define vmvn_u32 vmvn_s32 - -poly8x8_t vmvn_p8(poly8x8_t a); // VMVN d0,d0 -#define vmvn_p8 vmvn_u8 - -int8x16_t vmvnq_s8(int8x16_t a); // VMVN q0,q0 -_NEON2SSE_INLINE int8x16_t vmvnq_s8(int8x16_t a) // VMVN q0,q0 -{ - __m128i c1; - c1 = _mm_cmpeq_epi8 (a,a); //0xff - return _mm_andnot_si128 (a, c1); -} - -int16x8_t vmvnq_s16(int16x8_t a); // VMVN q0,q0 -_NEON2SSE_INLINE int16x8_t vmvnq_s16(int16x8_t a) // VMVN q0,q0 -{ - __m128i c1; - c1 = 
_mm_cmpeq_epi16 (a,a); //0xffff - return _mm_andnot_si128 (a, c1); -} - -int32x4_t vmvnq_s32(int32x4_t a); // VMVN q0,q0 -_NEON2SSE_INLINE int32x4_t vmvnq_s32(int32x4_t a) // VMVN q0,q0 -{ - __m128i c1; - c1 = _mm_cmpeq_epi32 (a,a); //0xffffffff - return _mm_andnot_si128 (a, c1); -} - -uint8x16_t vmvnq_u8(uint8x16_t a); // VMVN q0,q0 -#define vmvnq_u8 vmvnq_s8 - -uint16x8_t vmvnq_u16(uint16x8_t a); // VMVN q0,q0 -#define vmvnq_u16 vmvnq_s16 - -uint32x4_t vmvnq_u32(uint32x4_t a); // VMVN q0,q0 -#define vmvnq_u32 vmvnq_s32 - -poly8x16_t vmvnq_p8(poly8x16_t a); // VMVN q0,q0 -#define vmvnq_p8 vmvnq_u8 - -//****************** Bitwise and *********************** -//****************************************************** -int8x8_t vand_s8(int8x8_t a, int8x8_t b); // VAND d0,d0,d0 -_NEON2SSE_INLINE int8x8_t vand_s8(int8x8_t a, int8x8_t b) -{ - int8x8_t res64; - return64(_mm_and_si128(_pM128i(a),_pM128i(b))); -} - -int16x4_t vand_s16(int16x4_t a, int16x4_t b); // VAND d0,d0,d0 -_NEON2SSE_INLINE int16x4_t vand_s16(int16x4_t a, int16x4_t b) -{ - int16x4_t res64; - return64(_mm_and_si128(_pM128i(a),_pM128i(b))); -} - -int32x2_t vand_s32(int32x2_t a, int32x2_t b); // VAND d0,d0,d0 -_NEON2SSE_INLINE int32x2_t vand_s32(int32x2_t a, int32x2_t b) -{ - int32x2_t res64; - return64(_mm_and_si128(_pM128i(a),_pM128i(b))); -} - - -int64x1_t vand_s64(int64x1_t a, int64x1_t b); // VAND d0,d0,d0 -_NEON2SSE_INLINE int64x1_t vand_s64(int64x1_t a, int64x1_t b) -{ - int64x1_t res; - res.m64_i64[0] = a.m64_i64[0] & b.m64_i64[0]; - return res; -} - -uint8x8_t vand_u8(uint8x8_t a, uint8x8_t b); // VAND d0,d0,d0 -#define vand_u8 vand_s8 - -uint16x4_t vand_u16(uint16x4_t a, uint16x4_t b); // VAND d0,d0,d0 -#define vand_u16 vand_s16 - -uint32x2_t vand_u32(uint32x2_t a, uint32x2_t b); // VAND d0,d0,d0 -#define vand_u32 vand_s32 - -uint64x1_t vand_u64(uint64x1_t a, uint64x1_t b); // VAND d0,d0,d0 -#define vand_u64 vand_s64 - - -int8x16_t vandq_s8(int8x16_t a, int8x16_t b); // VAND q0,q0,q0 -#define 
vandq_s8 _mm_and_si128 - -int16x8_t vandq_s16(int16x8_t a, int16x8_t b); // VAND q0,q0,q0 -#define vandq_s16 _mm_and_si128 - -int32x4_t vandq_s32(int32x4_t a, int32x4_t b); // VAND q0,q0,q0 -#define vandq_s32 _mm_and_si128 - -int64x2_t vandq_s64(int64x2_t a, int64x2_t b); // VAND q0,q0,q0 -#define vandq_s64 _mm_and_si128 - -uint8x16_t vandq_u8(uint8x16_t a, uint8x16_t b); // VAND q0,q0,q0 -#define vandq_u8 _mm_and_si128 - -uint16x8_t vandq_u16(uint16x8_t a, uint16x8_t b); // VAND q0,q0,q0 -#define vandq_u16 _mm_and_si128 - -uint32x4_t vandq_u32(uint32x4_t a, uint32x4_t b); // VAND q0,q0,q0 -#define vandq_u32 _mm_and_si128 - -uint64x2_t vandq_u64(uint64x2_t a, uint64x2_t b); // VAND q0,q0,q0 -#define vandq_u64 _mm_and_si128 - -//******************** Bitwise or ********************************* -//****************************************************************** -int8x8_t vorr_s8(int8x8_t a, int8x8_t b); // VORR d0,d0,d0 -_NEON2SSE_INLINE int8x8_t vorr_s8(int8x8_t a, int8x8_t b) -{ - int8x8_t res64; - return64(_mm_or_si128(_pM128i(a),_pM128i(b))); -} - - -int16x4_t vorr_s16(int16x4_t a, int16x4_t b); // VORR d0,d0,d0 -_NEON2SSE_INLINE int16x4_t vorr_s16(int16x4_t a, int16x4_t b) -{ - int16x4_t res64; - return64(_mm_or_si128(_pM128i(a),_pM128i(b))); -} - - -int32x2_t vorr_s32(int32x2_t a, int32x2_t b); // VORR d0,d0,d0 -_NEON2SSE_INLINE int32x2_t vorr_s32(int32x2_t a, int32x2_t b) -{ - int32x2_t res64; - return64(_mm_or_si128(_pM128i(a),_pM128i(b))); -} - - -int64x1_t vorr_s64(int64x1_t a, int64x1_t b); // VORR d0,d0,d0 -_NEON2SSE_INLINE int64x1_t vorr_s64(int64x1_t a, int64x1_t b) -{ - int64x1_t res; - res.m64_i64[0] = a.m64_i64[0] | b.m64_i64[0]; - return res; -} - -uint8x8_t vorr_u8(uint8x8_t a, uint8x8_t b); // VORR d0,d0,d0 -#define vorr_u8 vorr_s8 - -uint16x4_t vorr_u16(uint16x4_t a, uint16x4_t b); // VORR d0,d0,d0 -#define vorr_u16 vorr_s16 - -uint32x2_t vorr_u32(uint32x2_t a, uint32x2_t b); // VORR d0,d0,d0 -#define vorr_u32 vorr_s32 - -uint64x1_t 
vorr_u64(uint64x1_t a, uint64x1_t b); // VORR d0,d0,d0 -#define vorr_u64 vorr_s64 - -int8x16_t vorrq_s8(int8x16_t a, int8x16_t b); // VORR q0,q0,q0 -#define vorrq_s8 _mm_or_si128 - -int16x8_t vorrq_s16(int16x8_t a, int16x8_t b); // VORR q0,q0,q0 -#define vorrq_s16 _mm_or_si128 - -int32x4_t vorrq_s32(int32x4_t a, int32x4_t b); // VORR q0,q0,q0 -#define vorrq_s32 _mm_or_si128 - -int64x2_t vorrq_s64(int64x2_t a, int64x2_t b); // VORR q0,q0,q0 -#define vorrq_s64 _mm_or_si128 - -uint8x16_t vorrq_u8(uint8x16_t a, uint8x16_t b); // VORR q0,q0,q0 -#define vorrq_u8 _mm_or_si128 - -uint16x8_t vorrq_u16(uint16x8_t a, uint16x8_t b); // VORR q0,q0,q0 -#define vorrq_u16 _mm_or_si128 - -uint32x4_t vorrq_u32(uint32x4_t a, uint32x4_t b); // VORR q0,q0,q0 -#define vorrq_u32 _mm_or_si128 - -uint64x2_t vorrq_u64(uint64x2_t a, uint64x2_t b); // VORR q0,q0,q0 -#define vorrq_u64 _mm_or_si128 - -//************* Bitwise exclusive or (EOR or XOR) ****************** -//******************************************************************* -int8x8_t veor_s8(int8x8_t a, int8x8_t b); // VEOR d0,d0,d0 -_NEON2SSE_INLINE int8x8_t veor_s8(int8x8_t a, int8x8_t b) -{ - int8x8_t res64; - return64(_mm_xor_si128(_pM128i(a),_pM128i(b))); -} - -int16x4_t veor_s16(int16x4_t a, int16x4_t b); // VEOR d0,d0,d0 -#define veor_s16 veor_s8 - -int32x2_t veor_s32(int32x2_t a, int32x2_t b); // VEOR d0,d0,d0 -#define veor_s32 veor_s8 - -int64x1_t veor_s64(int64x1_t a, int64x1_t b); // VEOR d0,d0,d0 -_NEON2SSE_INLINE int64x1_t veor_s64(int64x1_t a, int64x1_t b) -{ - int64x1_t res; - res.m64_i64[0] = a.m64_i64[0] ^ b.m64_i64[0]; - return res; -} - -uint8x8_t veor_u8(uint8x8_t a, uint8x8_t b); // VEOR d0,d0,d0 -#define veor_u8 veor_s8 - -uint16x4_t veor_u16(uint16x4_t a, uint16x4_t b); // VEOR d0,d0,d0 -#define veor_u16 veor_s16 - -uint32x2_t veor_u32(uint32x2_t a, uint32x2_t b); // VEOR d0,d0,d0 -#define veor_u32 veor_s32 - -uint64x1_t veor_u64(uint64x1_t a, uint64x1_t b); // VEOR d0,d0,d0 -#define veor_u64 veor_s64 - 
-int8x16_t veorq_s8(int8x16_t a, int8x16_t b); // VEOR q0,q0,q0 -#define veorq_s8 _mm_xor_si128 - -int16x8_t veorq_s16(int16x8_t a, int16x8_t b); // VEOR q0,q0,q0 -#define veorq_s16 _mm_xor_si128 - -int32x4_t veorq_s32(int32x4_t a, int32x4_t b); // VEOR q0,q0,q0 -#define veorq_s32 _mm_xor_si128 - -int64x2_t veorq_s64(int64x2_t a, int64x2_t b); // VEOR q0,q0,q0 -#define veorq_s64 _mm_xor_si128 - -uint8x16_t veorq_u8(uint8x16_t a, uint8x16_t b); // VEOR q0,q0,q0 -#define veorq_u8 _mm_xor_si128 - -uint16x8_t veorq_u16(uint16x8_t a, uint16x8_t b); // VEOR q0,q0,q0 -#define veorq_u16 _mm_xor_si128 - -uint32x4_t veorq_u32(uint32x4_t a, uint32x4_t b); // VEOR q0,q0,q0 -#define veorq_u32 _mm_xor_si128 - -uint64x2_t veorq_u64(uint64x2_t a, uint64x2_t b); // VEOR q0,q0,q0 -#define veorq_u64 _mm_xor_si128 - -//********************** Bit Clear ********************************** -//******************************************************************* -//Logical AND complement (AND negation or AND NOT) -int8x8_t vbic_s8(int8x8_t a, int8x8_t b); // VBIC d0,d0,d0 -_NEON2SSE_INLINE int8x8_t vbic_s8(int8x8_t a, int8x8_t b) -{ - int8x8_t res64; - return64(_mm_andnot_si128(_pM128i(b),_pM128i(a))); //notice the arguments "swap" -} - -int16x4_t vbic_s16(int16x4_t a, int16x4_t b); // VBIC d0,d0,d0 -#define vbic_s16 vbic_s8 - -int32x2_t vbic_s32(int32x2_t a, int32x2_t b); // VBIC d0,d0,d0 -#define vbic_s32 vbic_s8 - -int64x1_t vbic_s64(int64x1_t a, int64x1_t b); // VBIC d0,d0,d0 -_NEON2SSE_INLINE int64x1_t vbic_s64(int64x1_t a, int64x1_t b) -{ - int64x1_t res; - res.m64_i64[0] = a.m64_i64[0] & (~b.m64_i64[0]); - return res; -} - -uint8x8_t vbic_u8(uint8x8_t a, uint8x8_t b); // VBIC d0,d0,d0 -#define vbic_u8 vbic_s8 - -uint16x4_t vbic_u16(uint16x4_t a, uint16x4_t b); // VBIC d0,d0,d0 -#define vbic_u16 vbic_s16 - -uint32x2_t vbic_u32(uint32x2_t a, uint32x2_t b); // VBIC d0,d0,d0 -#define vbic_u32 vbic_s32 - -uint64x1_t vbic_u64(uint64x1_t a, uint64x1_t b); // VBIC d0,d0,d0 -#define vbic_u64 
vbic_s64 - -int8x16_t vbicq_s8(int8x16_t a, int8x16_t b); // VBIC q0,q0,q0 -#define vbicq_s8(a,b) _mm_andnot_si128 (b,a) //notice arguments "swap" - -int16x8_t vbicq_s16(int16x8_t a, int16x8_t b); // VBIC q0,q0,q0 -#define vbicq_s16(a,b) _mm_andnot_si128 (b,a) //notice arguments "swap" - -int32x4_t vbicq_s32(int32x4_t a, int32x4_t b); // VBIC q0,q0,q0 -#define vbicq_s32(a,b) _mm_andnot_si128 (b,a) //notice arguments "swap" - -int64x2_t vbicq_s64(int64x2_t a, int64x2_t b); // VBIC q0,q0,q0 -#define vbicq_s64(a,b) _mm_andnot_si128 (b,a) //notice arguments "swap" - -uint8x16_t vbicq_u8(uint8x16_t a, uint8x16_t b); // VBIC q0,q0,q0 -#define vbicq_u8(a,b) _mm_andnot_si128 (b,a) //notice arguments "swap" - -uint16x8_t vbicq_u16(uint16x8_t a, uint16x8_t b); // VBIC q0,q0,q0 -#define vbicq_u16(a,b) _mm_andnot_si128 (b,a) //notice arguments "swap" - -uint32x4_t vbicq_u32(uint32x4_t a, uint32x4_t b); // VBIC q0,q0,q0 -#define vbicq_u32(a,b) _mm_andnot_si128 (b,a) //notice arguments "swap" - -uint64x2_t vbicq_u64(uint64x2_t a, uint64x2_t b); // VBIC q0,q0,q0 -#define vbicq_u64(a,b) _mm_andnot_si128 (b,a) //notice arguments "swap" - -//**************** Bitwise OR complement ******************************** -//************************************************************************ -//no exact IA32 match, need to implement it as follows -int8x8_t vorn_s8(int8x8_t a, int8x8_t b); // VORN d0,d0,d0 -_NEON2SSE_INLINE int8x8_t vorn_s8(int8x8_t a, int8x8_t b) -{ - int8x8_t res64; - return64(vornq_s8(_pM128i(a), _pM128i(b))); -} - - -int16x4_t vorn_s16(int16x4_t a, int16x4_t b); // VORN d0,d0,d0 -_NEON2SSE_INLINE int16x4_t vorn_s16(int16x4_t a, int16x4_t b) -{ - int16x4_t res64; - return64(vornq_s16(_pM128i(a), _pM128i(b))); -} - - -int32x2_t vorn_s32(int32x2_t a, int32x2_t b); // VORN d0,d0,d0 -_NEON2SSE_INLINE int32x2_t vorn_s32(int32x2_t a, int32x2_t b) -{ - int32x2_t res64; - return64(vornq_s32(_pM128i(a), _pM128i(b))); -} - - -int64x1_t vorn_s64(int64x1_t a, int64x1_t b); //
VORN d0,d0,d0 -_NEON2SSE_INLINE int64x1_t vorn_s64(int64x1_t a, int64x1_t b) -{ - int64x1_t res; - res.m64_i64[0] = a.m64_i64[0] | (~b.m64_i64[0]); - return res; -} - -uint8x8_t vorn_u8(uint8x8_t a, uint8x8_t b); // VORN d0,d0,d0 -#define vorn_u8 vorn_s8 - - -uint16x4_t vorn_u16(uint16x4_t a, uint16x4_t b); // VORN d0,d0,d0 -#define vorn_u16 vorn_s16 - -uint32x2_t vorn_u32(uint32x2_t a, uint32x2_t b); // VORN d0,d0,d0 -#define vorn_u32 vorn_s32 - -uint64x1_t vorn_u64(uint64x1_t a, uint64x1_t b); // VORN d0,d0,d0 -#define vorn_u64 vorn_s64 - - -int8x16_t vornq_s8(int8x16_t a, int8x16_t b); // VORN q0,q0,q0 -_NEON2SSE_INLINE int8x16_t vornq_s8(int8x16_t a, int8x16_t b) // VORN q0,q0,q0 -{ - __m128i b1; - b1 = vmvnq_s8( b); //bitwise not for b - return _mm_or_si128 (a, b1); -} - -int16x8_t vornq_s16(int16x8_t a, int16x8_t b); // VORN q0,q0,q0 -_NEON2SSE_INLINE int16x8_t vornq_s16(int16x8_t a, int16x8_t b) // VORN q0,q0,q0 -{ - __m128i b1; - b1 = vmvnq_s16( b); //bitwise not for b - return _mm_or_si128 (a, b1); -} - -int32x4_t vornq_s32(int32x4_t a, int32x4_t b); // VORN q0,q0,q0 -_NEON2SSE_INLINE int32x4_t vornq_s32(int32x4_t a, int32x4_t b) // VORN q0,q0,q0 -{ - __m128i b1; - b1 = vmvnq_s32( b); //bitwise not for b - return _mm_or_si128 (a, b1); -} - -int64x2_t vornq_s64(int64x2_t a, int64x2_t b); // VORN q0,q0,q0 -_NEON2SSE_INLINE int64x2_t vornq_s64(int64x2_t a, int64x2_t b) -{ - __m128i c1, b1; - c1 = _mm_cmpeq_epi8 (a, a); //all ones 0xfffffff...fffff - b1 = _mm_andnot_si128 (b, c1); - return _mm_or_si128 (a, b1); -} - -uint8x16_t vornq_u8(uint8x16_t a, uint8x16_t b); // VORN q0,q0,q0 -_NEON2SSE_INLINE uint8x16_t vornq_u8(uint8x16_t a, uint8x16_t b) // VORN q0,q0,q0 -{ - __m128i b1; - b1 = vmvnq_u8( b); //bitwise not for b - return _mm_or_si128 (a, b1); -} - -uint16x8_t vornq_u16(uint16x8_t a, uint16x8_t b); // VORN q0,q0,q0 -_NEON2SSE_INLINE uint16x8_t vornq_u16(uint16x8_t a, uint16x8_t b) // VORN q0,q0,q0 -{ - __m128i b1; - b1 = vmvnq_s16( b); //bitwise not for 
b - return _mm_or_si128 (a, b1); -} - -uint32x4_t vornq_u32(uint32x4_t a, uint32x4_t b); // VORN q0,q0,q0 -_NEON2SSE_INLINE uint32x4_t vornq_u32(uint32x4_t a, uint32x4_t b) // VORN q0,q0,q0 -{ - __m128i b1; - b1 = vmvnq_u32( b); //bitwise not for b - return _mm_or_si128 (a, b1); -} -uint64x2_t vornq_u64(uint64x2_t a, uint64x2_t b); // VORN q0,q0,q0 -#define vornq_u64 vornq_s64 - -//********************* Bitwise Select ***************************** -//****************************************************************** -//Note: this intrinsic can compile to any of VBSL/VBIF/VBIT depending on register allocation. - -//VBSL (Bitwise Select) selects each bit for the destination from the first operand if the -//corresponding bit of the destination is 1, or from the second operand if the corresponding bit of the destination is 0. - -//VBIF (Bitwise Insert if False) inserts each bit from the first operand into the destination -//if the corresponding bit of the second operand is 0, otherwise leaves the destination bit unchanged. - -//VBIT (Bitwise Insert if True) inserts each bit from the first operand into the destination -//if the corresponding bit of the second operand is 1, otherwise leaves the destination bit unchanged.
- -//VBSL only is implemented for SIMD -int8x8_t vbsl_s8(uint8x8_t a, int8x8_t b, int8x8_t c); // VBSL d0,d0,d0 -_NEON2SSE_INLINE int8x8_t vbsl_s8(uint8x8_t a, int8x8_t b, int8x8_t c) -{ - int8x8_t res64; - __m128i res; - res = vbslq_s8(_pM128i(a), _pM128i(b), _pM128i(c)); - return64(res); -} - -int16x4_t vbsl_s16(uint16x4_t a, int16x4_t b, int16x4_t c); // VBSL d0,d0,d0 -#define vbsl_s16 vbsl_s8 - -int32x2_t vbsl_s32(uint32x2_t a, int32x2_t b, int32x2_t c); // VBSL d0,d0,d0 -#define vbsl_s32 vbsl_s8 - -int64x1_t vbsl_s64(uint64x1_t a, int64x1_t b, int64x1_t c); // VBSL d0,d0,d0 -_NEON2SSE_INLINE int64x1_t vbsl_s64(uint64x1_t a, int64x1_t b, int64x1_t c) -{ - int64x1_t res; - res.m64_i64[0] = (a.m64_i64[0] & b.m64_i64[0]) | ( (~a.m64_i64[0]) & c.m64_i64[0]); - return res; -} - -uint8x8_t vbsl_u8(uint8x8_t a, uint8x8_t b, uint8x8_t c); // VBSL d0,d0,d0 -#define vbsl_u8 vbsl_s8 - -uint16x4_t vbsl_u16(uint16x4_t a, uint16x4_t b, uint16x4_t c); // VBSL d0,d0,d0 -#define vbsl_u16 vbsl_s8 - -uint32x2_t vbsl_u32(uint32x2_t a, uint32x2_t b, uint32x2_t c); // VBSL d0,d0,d0 -#define vbsl_u32 vbsl_s8 - -uint64x1_t vbsl_u64(uint64x1_t a, uint64x1_t b, uint64x1_t c); // VBSL d0,d0,d0 -#define vbsl_u64 vbsl_s64 - -float32x2_t vbsl_f32(uint32x2_t a, float32x2_t b, float32x2_t c); // VBSL d0,d0,d0 -_NEON2SSE_INLINE float32x2_t vbsl_f32(uint32x2_t a, float32x2_t b, float32x2_t c) -{ - __m128 sel1, sel2; - __m64_128 res64; - sel1 = _mm_and_ps (_pM128(a), _pM128(b)); - sel2 = _mm_andnot_ps (_pM128(a), _pM128(c)); - sel1 = _mm_or_ps (sel1, sel2); - _M64f(res64, sel1); - return res64; -} - -poly8x8_t vbsl_p8(uint8x8_t a, poly8x8_t b, poly8x8_t c); // VBSL d0,d0,d0 -#define vbsl_p8 vbsl_s8 - -poly16x4_t vbsl_p16(uint16x4_t a, poly16x4_t b, poly16x4_t c); // VBSL d0,d0,d0 -#define vbsl_p16 vbsl_s8 - -int8x16_t vbslq_s8(uint8x16_t a, int8x16_t b, int8x16_t c); // VBSL q0,q0,q0 -_NEON2SSE_INLINE int8x16_t vbslq_s8(uint8x16_t a, int8x16_t b, int8x16_t c) // VBSL q0,q0,q0 -{ - __m128i sel1, 
sel2; - sel1 = _mm_and_si128 (a, b); - sel2 = _mm_andnot_si128 (a, c); - return _mm_or_si128 (sel1, sel2); -} - -int16x8_t vbslq_s16(uint16x8_t a, int16x8_t b, int16x8_t c); // VBSL q0,q0,q0 -#define vbslq_s16 vbslq_s8 - -int32x4_t vbslq_s32(uint32x4_t a, int32x4_t b, int32x4_t c); // VBSL q0,q0,q0 -#define vbslq_s32 vbslq_s8 - -int64x2_t vbslq_s64(uint64x2_t a, int64x2_t b, int64x2_t c); // VBSL q0,q0,q0 -#define vbslq_s64 vbslq_s8 - -uint8x16_t vbslq_u8(uint8x16_t a, uint8x16_t b, uint8x16_t c); // VBSL q0,q0,q0 -#define vbslq_u8 vbslq_s8 - -uint16x8_t vbslq_u16(uint16x8_t a, uint16x8_t b, uint16x8_t c); // VBSL q0,q0,q0 -#define vbslq_u16 vbslq_s8 - -uint32x4_t vbslq_u32(uint32x4_t a, uint32x4_t b, uint32x4_t c); // VBSL q0,q0,q0 -#define vbslq_u32 vbslq_s8 - -uint64x2_t vbslq_u64(uint64x2_t a, uint64x2_t b, uint64x2_t c); // VBSL q0,q0,q0 -#define vbslq_u64 vbslq_s8 - -float32x4_t vbslq_f32(uint32x4_t a, float32x4_t b, float32x4_t c); // VBSL q0,q0,q0 -_NEON2SSE_INLINE float32x4_t vbslq_f32(uint32x4_t a, float32x4_t b, float32x4_t c) // VBSL q0,q0,q0 -{ - __m128 sel1, sel2; - sel1 = _mm_and_ps (*(__m128*)&a, b); - sel2 = _mm_andnot_ps (*(__m128*)&a, c); - return _mm_or_ps (sel1, sel2); -} - -poly8x16_t vbslq_p8(uint8x16_t a, poly8x16_t b, poly8x16_t c); // VBSL q0,q0,q0 -#define vbslq_p8 vbslq_u8 - -poly16x8_t vbslq_p16(uint16x8_t a, poly16x8_t b, poly16x8_t c); // VBSL q0,q0,q0 -#define vbslq_p16 vbslq_s8 - -//************************************************************************************ -//**************** Transposition operations **************************************** -//************************************************************************************ -//***************** Vector Transpose ************************************************ -//************************************************************************************ -//VTRN (Vector Transpose) treats the elements of its operand vectors as elements of 2 x 2 matrices, and transposes the 
matrices. -// making the result look as (a0, b0, a2, b2, a4, b4,....) (a1, b1, a3, b3, a5, b5,.....) -int8x8x2_t vtrn_s8(int8x8_t a, int8x8_t b); // VTRN.8 d0,d0 -_NEON2SSE_INLINE int8x8x2_t vtrn_s8(int8x8_t a, int8x8_t b) // VTRN.8 d0,d0 -{ - int8x8x2_t val; - __m128i tmp, val0; - _NEON2SSE_ALIGN_16 int8_t mask16_even_odd[16] = { 0,1, 4,5, 8,9, 12,13, 2,3, 6,7, 10,11, 14,15}; //mask8_trnsp - tmp = _mm_unpacklo_epi8(_pM128i(a), _pM128i(b)); //a0,b0,a1,b1,a2,b2,a3,b3,...,a7,b7 - val0 = _mm_shuffle_epi8 (tmp, *(__m128i*)mask16_even_odd); //(a0, b0, a2, b2, a4, b4, a6, b6), (a1,b1, a3,b3, a5,b5, a7,b7) - vst1q_s8 (val.val, val0); // _mm_shuffle_epi32 (val.val[0], _SWAP_HI_LOW32); //(a1,b1, a3,b3, a5,b5, a7,b7),(a0, b0, a2, b2, a4, b4, a6, b6), - return val; -} - -int16x4x2_t vtrn_s16(int16x4_t a, int16x4_t b); // VTRN.16 d0,d0 -_NEON2SSE_INLINE int16x4x2_t vtrn_s16(int16x4_t a, int16x4_t b) // VTRN.16 d0,d0 -{ - int16x4x2_t val; - __m128i tmp, val0; - _NEON2SSE_ALIGN_16 int8_t maskdlv16[16] = {0,1, 2,3, 8,9, 10,11, 4,5, 6,7, 12,13, 14, 15}; - tmp = _mm_unpacklo_epi16(_pM128i(a), _pM128i(b)); //a0,b0,a1,b1,a2,b2,a3,b3 - val0 = _mm_shuffle_epi8 (tmp, *(__m128i*)maskdlv16); //a0, b0, a2, b2, a1,b1, a3, b3 - vst1q_s16(val.val, val0); // _mm_shuffle_epi32 (val.val[0], _SWAP_HI_LOW32); //(a1,b1, a3,b3),(a0, b0, a2, b2), - return val; -} - -int32x2x2_t vtrn_s32(int32x2_t a, int32x2_t b); // VTRN.32 d0,d0 -_NEON2SSE_INLINE int32x2x2_t vtrn_s32(int32x2_t a, int32x2_t b) -{ - int32x2x2_t val; - __m128i val0; - val0 = _mm_unpacklo_epi32(_pM128i(a), _pM128i(b)); //a0,b0,a1,b1 - vst1q_s32(val.val, val0); // _mm_shuffle_epi32(val.val[0], _SWAP_HI_LOW32); //a1,b1, a0,b0, - return val; -} - -uint8x8x2_t vtrn_u8(uint8x8_t a, uint8x8_t b); // VTRN.8 d0,d0 -#define vtrn_u8 vtrn_s8 - -uint16x4x2_t vtrn_u16(uint16x4_t a, uint16x4_t b); // VTRN.16 d0,d0 -#define vtrn_u16 vtrn_s16 - -uint32x2x2_t vtrn_u32(uint32x2_t a, uint32x2_t b); // VTRN.32 d0,d0 -#define vtrn_u32 vtrn_s32 - 
-float32x2x2_t vtrn_f32(float32x2_t a, float32x2_t b); // VTRN.32 d0,d0 -_NEON2SSE_INLINE float32x2x2_t vtrn_f32(float32x2_t a, float32x2_t b) -{ - float32x2x2_t val; - val.val[0].m64_f32[0] = a.m64_f32[0]; - val.val[0].m64_f32[1] = b.m64_f32[0]; - val.val[1].m64_f32[0] = a.m64_f32[1]; - val.val[1].m64_f32[1] = b.m64_f32[1]; - return val; //a0,b0,a1,b1 -} - -poly8x8x2_t vtrn_p8(poly8x8_t a, poly8x8_t b); // VTRN.8 d0,d0 -#define vtrn_p8 vtrn_u8 - -poly16x4x2_t vtrn_p16(poly16x4_t a, poly16x4_t b); // VTRN.16 d0,d0 -#define vtrn_p16 vtrn_s16 - -int8x16x2_t vtrnq_s8(int8x16_t a, int8x16_t b); // VTRN.8 q0,q0 -_NEON2SSE_INLINE int8x16x2_t vtrnq_s8(int8x16_t a, int8x16_t b) // VTRN.8 q0,q0 -{ - int8x16x2_t r8x16; - __m128i a_sh, b_sh; - _NEON2SSE_ALIGN_16 int8_t mask8_even_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3,5, 7, 9, 11, 13, 15}; - a_sh = _mm_shuffle_epi8 (a, *(__m128i*)mask8_even_odd); //a0, a2, a4, a6, a8, a10, a12, a14, a1, a3, a5, a7, a9, a11, a13, a15 - b_sh = _mm_shuffle_epi8 (b, *(__m128i*)mask8_even_odd); //b0, b2, b4, b6, b8, b10, b12, b14, b1, b3, b5, b7, b9, b11, b13, b15 - - r8x16.val[0] = _mm_unpacklo_epi8(a_sh, b_sh); //(a0, b0, a2, b2, a4, b4, a6, b6, a8,b8, a10,b10, a12,b12, a14,b14) - r8x16.val[1] = _mm_unpackhi_epi8(a_sh, b_sh); // (a1, b1, a3, b3, a5, b5, a7, b7, a9,b9, a11,b11, a13,b13, a15,b15) - return r8x16; -} - -int16x8x2_t vtrnq_s16(int16x8_t a, int16x8_t b); // VTRN.16 q0,q0 -_NEON2SSE_INLINE int16x8x2_t vtrnq_s16(int16x8_t a, int16x8_t b) // VTRN.16 q0,q0 -{ - int16x8x2_t v16x8; - __m128i a_sh, b_sh; - _NEON2SSE_ALIGN_16 int8_t mask16_even_odd[16] = { 0,1, 4,5, 8,9, 12,13, 2,3, 6,7, 10,11, 14,15}; - a_sh = _mm_shuffle_epi8 (a, *(__m128i*)mask16_even_odd); //a0, a2, a4, a6, a1, a3, a5, a7 - b_sh = _mm_shuffle_epi8 (b, *(__m128i*)mask16_even_odd); //b0, b2, b4, b6, b1, b3, b5, b7 - v16x8.val[0] = _mm_unpacklo_epi16(a_sh, b_sh); //a0, b0, a2, b2, a4, b4, a6, b6 - v16x8.val[1] = _mm_unpackhi_epi16(a_sh, b_sh); //a1, b1, a3, b3, a5, b5,
a7, b7 - return v16x8; -} - -int32x4x2_t vtrnq_s32(int32x4_t a, int32x4_t b); // VTRN.32 q0,q0 -_NEON2SSE_INLINE int32x4x2_t vtrnq_s32(int32x4_t a, int32x4_t b) // VTRN.32 q0,q0 -{ - //may not be optimal compared with a serial implementation - int32x4x2_t v32x4; - __m128i a_sh, b_sh; - a_sh = _mm_shuffle_epi32 (a, 216); //a0, a2, a1, a3 - b_sh = _mm_shuffle_epi32 (b, 216); //b0, b2, b1, b3 - - v32x4.val[0] = _mm_unpacklo_epi32(a_sh, b_sh); //a0, b0, a2, b2 - v32x4.val[1] = _mm_unpackhi_epi32(a_sh, b_sh); //a1, b1, a3, b3 - return v32x4; -} - -uint8x16x2_t vtrnq_u8(uint8x16_t a, uint8x16_t b); // VTRN.8 q0,q0 -#define vtrnq_u8 vtrnq_s8 - -uint16x8x2_t vtrnq_u16(uint16x8_t a, uint16x8_t b); // VTRN.16 q0,q0 -#define vtrnq_u16 vtrnq_s16 - -uint32x4x2_t vtrnq_u32(uint32x4_t a, uint32x4_t b); // VTRN.32 q0,q0 -#define vtrnq_u32 vtrnq_s32 - -float32x4x2_t vtrnq_f32(float32x4_t a, float32x4_t b); // VTRN.32 q0,q0 -_NEON2SSE_INLINE float32x4x2_t vtrnq_f32(float32x4_t a, float32x4_t b) // VTRN.32 q0,q0 -{ - //may not be optimal compared with a serial implementation - float32x4x2_t f32x4; - __m128 a_sh, b_sh; - a_sh = _mm_shuffle_ps (a, a, _MM_SHUFFLE(3,1, 2, 0)); //a0, a2, a1, a3, need to check endianness - b_sh = _mm_shuffle_ps (b, b, _MM_SHUFFLE(3,1, 2, 0)); //b0, b2, b1, b3, need to check endianness - - f32x4.val[0] = _mm_unpacklo_ps(a_sh, b_sh); //a0, b0, a2, b2 - f32x4.val[1] = _mm_unpackhi_ps(a_sh, b_sh); //a1, b1, a3, b3 - return f32x4; -} - -poly8x16x2_t vtrnq_p8(poly8x16_t a, poly8x16_t b); // VTRN.8 q0,q0 -#define vtrnq_p8 vtrnq_s8 - -poly16x8x2_t vtrnq_p16(poly16x8_t a, poly16x8_t b); // VTRN.16 q0,q0 -#define vtrnq_p16 vtrnq_s16 - -//***************** Interleave elements *************************** -//***************************************************************** -//output has (a0,b0,a1,b1, a2,b2,.....)
-int8x8x2_t vzip_s8(int8x8_t a, int8x8_t b); // VZIP.8 d0,d0 -_NEON2SSE_INLINE int8x8x2_t vzip_s8(int8x8_t a, int8x8_t b) // VZIP.8 d0,d0 -{ - int8x8x2_t val; - __m128i val0; - val0 = _mm_unpacklo_epi8(_pM128i(a), _pM128i(b)); - vst1q_s8(val.val, val0); //_mm_shuffle_epi32(val.val[0], _SWAP_HI_LOW32); - return val; -} - -int16x4x2_t vzip_s16(int16x4_t a, int16x4_t b); // VZIP.16 d0,d0 -_NEON2SSE_INLINE int16x4x2_t vzip_s16(int16x4_t a, int16x4_t b) // VZIP.16 d0,d0 -{ - int16x4x2_t val; - __m128i val0; - val0 = _mm_unpacklo_epi16(_pM128i(a), _pM128i(b)); - vst1q_s16(val.val, val0); // _mm_shuffle_epi32(val.val[0], _SWAP_HI_LOW32); - return val; -} - -int32x2x2_t vzip_s32(int32x2_t a, int32x2_t b); // VZIP.32 d0,d0 -#define vzip_s32 vtrn_s32 - -uint8x8x2_t vzip_u8(uint8x8_t a, uint8x8_t b); // VZIP.8 d0,d0 -#define vzip_u8 vzip_s8 - -uint16x4x2_t vzip_u16(uint16x4_t a, uint16x4_t b); // VZIP.16 d0,d0 -#define vzip_u16 vzip_s16 - -uint32x2x2_t vzip_u32(uint32x2_t a, uint32x2_t b); // VZIP.32 d0,d0 -#define vzip_u32 vzip_s32 - -float32x2x2_t vzip_f32(float32x2_t a, float32x2_t b); // VZIP.32 d0,d0 -#define vzip_f32 vtrn_f32 - -poly8x8x2_t vzip_p8(poly8x8_t a, poly8x8_t b); // VZIP.8 d0,d0 -#define vzip_p8 vzip_u8 - -poly16x4x2_t vzip_p16(poly16x4_t a, poly16x4_t b); // VZIP.16 d0,d0 -#define vzip_p16 vzip_u16 - -int8x16x2_t vzipq_s8(int8x16_t a, int8x16_t b); // VZIP.8 q0,q0 -_NEON2SSE_INLINE int8x16x2_t vzipq_s8(int8x16_t a, int8x16_t b) // VZIP.8 q0,q0 -{ - int8x16x2_t r8x16; - r8x16.val[0] = _mm_unpacklo_epi8(a, b); - r8x16.val[1] = _mm_unpackhi_epi8(a, b); - return r8x16; -} - -int16x8x2_t vzipq_s16(int16x8_t a, int16x8_t b); // VZIP.16 q0,q0 -_NEON2SSE_INLINE int16x8x2_t vzipq_s16(int16x8_t a, int16x8_t b) // VZIP.16 q0,q0 -{ - int16x8x2_t r16x8; - r16x8.val[0] = _mm_unpacklo_epi16(a, b); - r16x8.val[1] = _mm_unpackhi_epi16(a, b); - return r16x8; -} - -int32x4x2_t vzipq_s32(int32x4_t a, int32x4_t b); // VZIP.32 q0,q0 -_NEON2SSE_INLINE int32x4x2_t 
vzipq_s32(int32x4_t a, int32x4_t b) // VZIP.32 q0,q0 -{ - int32x4x2_t r32x4; - r32x4.val[0] = _mm_unpacklo_epi32(a, b); - r32x4.val[1] = _mm_unpackhi_epi32(a, b); - return r32x4; -} - -uint8x16x2_t vzipq_u8(uint8x16_t a, uint8x16_t b); // VZIP.8 q0,q0 -#define vzipq_u8 vzipq_s8 - -uint16x8x2_t vzipq_u16(uint16x8_t a, uint16x8_t b); // VZIP.16 q0,q0 -#define vzipq_u16 vzipq_s16 - -uint32x4x2_t vzipq_u32(uint32x4_t a, uint32x4_t b); // VZIP.32 q0,q0 -#define vzipq_u32 vzipq_s32 - -float32x4x2_t vzipq_f32(float32x4_t a, float32x4_t b); // VZIP.32 q0,q0 -_NEON2SSE_INLINE float32x4x2_t vzipq_f32(float32x4_t a, float32x4_t b) // VZIP.32 q0,q0 -{ - float32x4x2_t f32x4; - f32x4.val[0] = _mm_unpacklo_ps ( a, b); - f32x4.val[1] = _mm_unpackhi_ps ( a, b); - return f32x4; -} - -poly8x16x2_t vzipq_p8(poly8x16_t a, poly8x16_t b); // VZIP.8 q0,q0 -#define vzipq_p8 vzipq_u8 - -poly16x8x2_t vzipq_p16(poly16x8_t a, poly16x8_t b); // VZIP.16 q0,q0 -#define vzipq_p16 vzipq_u16 - -//*********************** De-Interleave elements ************************* -//************************************************************************* -//As the result of these functions first val contains (a0,a2,a4,....,b0,b2, b4,...) and the second val (a1,a3,a5,....b1,b3,b5...) 
-//no such functions in IA32 SIMD, shuffle is required -int8x8x2_t vuzp_s8(int8x8_t a, int8x8_t b); // VUZP.8 d0,d0 -_NEON2SSE_INLINE int8x8x2_t vuzp_s8(int8x8_t a, int8x8_t b) // VUZP.8 d0,d0 -{ - int8x8x2_t val; - __m128i tmp, val0; - _NEON2SSE_ALIGN_16 int8_t maskdlv8[16] = { 0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11,15}; - tmp = _mm_unpacklo_epi8(_pM128i(a), _pM128i(b)); //a0,b0,a1,b1,a2,b2,a3,b3,...,a7,b7 - val0 = _mm_shuffle_epi8 (tmp, *(__m128i*)maskdlv8); //(a0, a2, a4, a6, b0, b2, b4, b6), (a1, a3, a5, a7, b1,b3, b5, b7) - vst1q_s8(val.val, val0); // _mm_shuffle_epi32(val.val[0], _SWAP_HI_LOW32); - return val; -} - -int16x4x2_t vuzp_s16(int16x4_t a, int16x4_t b); // VUZP.16 d0,d0 -_NEON2SSE_INLINE int16x4x2_t vuzp_s16(int16x4_t a, int16x4_t b) // VUZP.16 d0,d0 -{ - int16x4x2_t val; - __m128i tmp, val0; - _NEON2SSE_ALIGN_16 int8_t maskdlv16[16] = {0,1, 8,9, 2,3, 10,11, 4,5, 12,13, 6,7, 14,15}; - tmp = _mm_unpacklo_epi16(_pM128i(a), _pM128i(b)); //a0,b0,a1,b1,a2,b2,a3,b3 - val0 = _mm_shuffle_epi8 (tmp, *(__m128i*)maskdlv16); //a0,a2, b0, b2, a1,a3, b1,b3 - vst1q_s16(val.val, val0); // _mm_shuffle_epi32(val.val[0], _SWAP_HI_LOW32); - return val; -} - -int32x2x2_t vuzp_s32(int32x2_t a, int32x2_t b); // VUZP.32 d0,d0 -_NEON2SSE_INLINE int32x2x2_t vuzp_s32(int32x2_t a, int32x2_t b) // VUZP.32 d0,d0 -{ - int32x2x2_t val; - __m128i val0; - val0 = _mm_unpacklo_epi32(_pM128i(a), _pM128i(b)); //a0,b0, a1,b1 - vst1q_s32(val.val, val0); // _mm_shuffle_epi32(val.val[0], _SWAP_HI_LOW32); - return val; -} - -uint8x8x2_t vuzp_u8(uint8x8_t a, uint8x8_t b); // VUZP.8 d0,d0 -#define vuzp_u8 vuzp_s8 - -uint16x4x2_t vuzp_u16(uint16x4_t a, uint16x4_t b); // VUZP.16 d0,d0 -#define vuzp_u16 vuzp_s16 - -uint32x2x2_t vuzp_u32(uint32x2_t a, uint32x2_t b); // VUZP.32 d0,d0 -#define vuzp_u32 vuzp_s32 - -float32x2x2_t vuzp_f32(float32x2_t a, float32x2_t b); // VUZP.32 d0,d0 -#define vuzp_f32 vzip_f32 - -poly8x8x2_t vuzp_p8(poly8x8_t a, poly8x8_t b); // VUZP.8 d0,d0 -#define 
vuzp_p8 vuzp_u8 - -poly16x4x2_t vuzp_p16(poly16x4_t a, poly16x4_t b); // VUZP.16 d0,d0 -#define vuzp_p16 vuzp_u16 - -int8x16x2_t vuzpq_s8(int8x16_t a, int8x16_t b); // VUZP.8 q0,q0 -_NEON2SSE_INLINE int8x16x2_t vuzpq_s8(int8x16_t a, int8x16_t b) // VUZP.8 q0,q0 -{ - int8x16x2_t v8x16; - __m128i a_sh, b_sh; - _NEON2SSE_ALIGN_16 int8_t mask8_even_odd[16] = { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15}; - a_sh = _mm_shuffle_epi8 (a, *(__m128i*)mask8_even_odd); //a0, a2, a4, a6, a8, a10, a12, a14, a1, a3, a5, a7, a9, a11, a13, a15 - b_sh = _mm_shuffle_epi8 (b, *(__m128i*)mask8_even_odd); //b0, b2, b4, b6, b8, b10, b12, b14, b1, b3, b5, b7, b9, b11, b13, b15 - //we need unpack64 to combine lower (upper) 64 bits from a with lower (upper) 64 bits from b - v8x16.val[0] = _mm_unpacklo_epi64(a_sh, b_sh); //a0, a2, a4, a6, a8, a10, a12, a14, b0, b2, b4, b6, b8, b10, b12, b14, - v8x16.val[1] = _mm_unpackhi_epi64(a_sh, b_sh); //a1, a3, a5, a7, a9, a11, a13, a15, b1, b3, b5, b7, b9, b11, b13, b15 - return v8x16; -} - -int16x8x2_t vuzpq_s16(int16x8_t a, int16x8_t b); // VUZP.16 q0,q0 -_NEON2SSE_INLINE int16x8x2_t vuzpq_s16(int16x8_t a, int16x8_t b) // VUZP.16 q0,q0 -{ - int16x8x2_t v16x8; - __m128i a_sh, b_sh; - _NEON2SSE_ALIGN_16 int8_t mask16_even_odd[16] = { 0,1, 4,5, 8,9, 12,13, 2,3, 6,7, 10,11, 14,15}; - a_sh = _mm_shuffle_epi8 (a, *(__m128i*)mask16_even_odd); //a0, a2, a4, a6, a1, a3, a5, a7 - b_sh = _mm_shuffle_epi8 (b, *(__m128i*)mask16_even_odd); //b0, b2, b4, b6, b1, b3, b5, b7 - v16x8.val[0] = _mm_unpacklo_epi64(a_sh, b_sh); //a0, a2, a4, a6, b0, b2, b4, b6 - v16x8.val[1] = _mm_unpackhi_epi64(a_sh, b_sh); //a1, a3, a5, a7, b1, b3, b5, b7 - return v16x8; -} - -int32x4x2_t vuzpq_s32(int32x4_t a, int32x4_t b); // VUZP.32 q0,q0 -_NEON2SSE_INLINE int32x4x2_t vuzpq_s32(int32x4_t a, int32x4_t b) // VUZP.32 q0,q0 -{ - //may not be optimal compared with a serial implementation - int32x4x2_t v32x4; - __m128i a_sh, b_sh; - a_sh = _mm_shuffle_epi32 (a, 216); //a0, a2, a1, a3 - 
b_sh = _mm_shuffle_epi32 (b, 216); //b0, b2, b1, b3 - - v32x4.val[0] = _mm_unpacklo_epi64(a_sh, b_sh); //a0, a2, b0, b2 - v32x4.val[1] = _mm_unpackhi_epi64(a_sh, b_sh); //a1, a3, b1, b3 - return v32x4; -} - -uint8x16x2_t vuzpq_u8(uint8x16_t a, uint8x16_t b); // VUZP.8 q0,q0 -#define vuzpq_u8 vuzpq_s8 - -uint16x8x2_t vuzpq_u16(uint16x8_t a, uint16x8_t b); // VUZP.16 q0,q0 -#define vuzpq_u16 vuzpq_s16 - -uint32x4x2_t vuzpq_u32(uint32x4_t a, uint32x4_t b); // VUZP.32 q0,q0 -#define vuzpq_u32 vuzpq_s32 - -float32x4x2_t vuzpq_f32(float32x4_t a, float32x4_t b); // VUZP.32 q0,q0 -_NEON2SSE_INLINE float32x4x2_t vuzpq_f32(float32x4_t a, float32x4_t b) // VUZP.32 q0,q0 -{ - float32x4x2_t v32x4; - v32x4.val[0] = _mm_shuffle_ps(a, b, _MM_SHUFFLE(2,0, 2, 0)); //a0, a2, b0, b2, need to check endianness however - v32x4.val[1] = _mm_shuffle_ps(a, b, _MM_SHUFFLE(3,1, 3, 1)); //a1, a3, b1, b3, need to check endianness however - return v32x4; -} - -poly8x16x2_t vuzpq_p8(poly8x16_t a, poly8x16_t b); // VUZP.8 q0,q0 -#define vuzpq_p8 vuzpq_u8 - -poly16x8x2_t vuzpq_p16(poly16x8_t a, poly16x8_t b); // VUZP.16 q0,q0 -#define vuzpq_p16 vuzpq_u16 - -//############################################################################################## -//*********************** Reinterpret cast intrinsics.****************************************** -//############################################################################################## -// Not a part of the official NEON instruction set, but available in the gcc compiler ********************* -poly8x8_t vreinterpret_p8_u32 (uint32x2_t t); -#define vreinterpret_p8_u32 - -poly8x8_t vreinterpret_p8_u16 (uint16x4_t t); -#define vreinterpret_p8_u16 - -poly8x8_t vreinterpret_p8_u8 (uint8x8_t t); -#define vreinterpret_p8_u8 - -poly8x8_t vreinterpret_p8_s32 (int32x2_t t); -#define vreinterpret_p8_s32 - -poly8x8_t vreinterpret_p8_s16 (int16x4_t t); -#define vreinterpret_p8_s16 - -poly8x8_t vreinterpret_p8_s8 (int8x8_t t); -#define vreinterpret_p8_s8 - 
-poly8x8_t vreinterpret_p8_u64 (uint64x1_t t); -#define vreinterpret_p8_u64 - -poly8x8_t vreinterpret_p8_s64 (int64x1_t t); -#define vreinterpret_p8_s64 - -poly8x8_t vreinterpret_p8_f32 (float32x2_t t); -#define vreinterpret_p8_f32 - -poly8x8_t vreinterpret_p8_p16 (poly16x4_t t); -#define vreinterpret_p8_p16 - -poly8x16_t vreinterpretq_p8_u32 (uint32x4_t t); -#define vreinterpretq_p8_u32 - -poly8x16_t vreinterpretq_p8_u16 (uint16x8_t t); -#define vreinterpretq_p8_u16 - -poly8x16_t vreinterpretq_p8_u8 (uint8x16_t t); -#define vreinterpretq_p8_u8 - -poly8x16_t vreinterpretq_p8_s32 (int32x4_t t); -#define vreinterpretq_p8_s32 - -poly8x16_t vreinterpretq_p8_s16 (int16x8_t t); -#define vreinterpretq_p8_s16 - -poly8x16_t vreinterpretq_p8_s8 (int8x16_t t); -#define vreinterpretq_p8_s8 - -poly8x16_t vreinterpretq_p8_u64 (uint64x2_t t); -#define vreinterpretq_p8_u64 - -poly8x16_t vreinterpretq_p8_s64 (int64x2_t t); -#define vreinterpretq_p8_s64 - -poly8x16_t vreinterpretq_p8_f32 (float32x4_t t); -#define vreinterpretq_p8_f32(t) _M128i(t) - -poly8x16_t vreinterpretq_p8_p16 (poly16x8_t t); -#define vreinterpretq_p8_p16 - -poly16x4_t vreinterpret_p16_u32 (uint32x2_t t); -#define vreinterpret_p16_u32 - -poly16x4_t vreinterpret_p16_u16 (uint16x4_t t); -#define vreinterpret_p16_u16 - -poly16x4_t vreinterpret_p16_u8 (uint8x8_t t); -#define vreinterpret_p16_u8 - -poly16x4_t vreinterpret_p16_s32 (int32x2_t t); -#define vreinterpret_p16_s32 - -poly16x4_t vreinterpret_p16_s16 (int16x4_t t); -#define vreinterpret_p16_s16 - -poly16x4_t vreinterpret_p16_s8 (int8x8_t t); -#define vreinterpret_p16_s8 - -poly16x4_t vreinterpret_p16_u64 (uint64x1_t t); -#define vreinterpret_p16_u64 - -poly16x4_t vreinterpret_p16_s64 (int64x1_t t); -#define vreinterpret_p16_s64 - -poly16x4_t vreinterpret_p16_f32 (float32x2_t t); -#define vreinterpret_p16_f32 - -poly16x4_t vreinterpret_p16_p8 (poly8x8_t t); -#define vreinterpret_p16_p8 - -poly16x8_t vreinterpretq_p16_u32 (uint32x4_t t); -#define 
vreinterpretq_p16_u32 - -poly16x8_t vreinterpretq_p16_u16 (uint16x8_t t); -#define vreinterpretq_p16_u16 - -poly16x8_t vreinterpretq_p16_s32 (int32x4_t t); -#define vreinterpretq_p16_s32 - -poly16x8_t vreinterpretq_p16_s16 (int16x8_t t); -#define vreinterpretq_p16_s16 - -poly16x8_t vreinterpretq_p16_s8 (int8x16_t t); -#define vreinterpretq_p16_s8 - -poly16x8_t vreinterpretq_p16_u64 (uint64x2_t t); -#define vreinterpretq_p16_u64 - -poly16x8_t vreinterpretq_p16_s64 (int64x2_t t); -#define vreinterpretq_p16_s64 - -poly16x8_t vreinterpretq_p16_f32 (float32x4_t t); -#define vreinterpretq_p16_f32(t) _M128i(t) - -poly16x8_t vreinterpretq_p16_p8 (poly8x16_t t); -#define vreinterpretq_p16_p8 vreinterpretq_s16_p8 - -//**** Integer to float ****** -float32x2_t vreinterpret_f32_u32 (uint32x2_t t); -#define vreinterpret_f32_u32(t) (*(__m64_128*)&(t)) - - -float32x2_t vreinterpret_f32_u16 (uint16x4_t t); -#define vreinterpret_f32_u16 vreinterpret_f32_u32 - - -float32x2_t vreinterpret_f32_u8 (uint8x8_t t); -#define vreinterpret_f32_u8 vreinterpret_f32_u32 - - -float32x2_t vreinterpret_f32_s32 (int32x2_t t); -#define vreinterpret_f32_s32 vreinterpret_f32_u32 - - -float32x2_t vreinterpret_f32_s16 (int16x4_t t); -#define vreinterpret_f32_s16 vreinterpret_f32_u32 - -float32x2_t vreinterpret_f32_s8 (int8x8_t t); -#define vreinterpret_f32_s8 vreinterpret_f32_u32 - - -float32x2_t vreinterpret_f32_u64(uint64x1_t t); -#define vreinterpret_f32_u64 vreinterpret_f32_u32 - - -float32x2_t vreinterpret_f32_s64 (int64x1_t t); -#define vreinterpret_f32_s64 vreinterpret_f32_u32 - - -float32x2_t vreinterpret_f32_p16 (poly16x4_t t); -#define vreinterpret_f32_p16 vreinterpret_f32_u32 - -float32x2_t vreinterpret_f32_p8 (poly8x8_t t); -#define vreinterpret_f32_p8 vreinterpret_f32_u32 - -float32x4_t vreinterpretq_f32_u32 (uint32x4_t t); -#define vreinterpretq_f32_u32(t) *(__m128*)&(t) - -float32x4_t vreinterpretq_f32_u16 (uint16x8_t t); -#define vreinterpretq_f32_u16 vreinterpretq_f32_u32 - -float32x4_t 
vreinterpretq_f32_u8 (uint8x16_t t); -#define vreinterpretq_f32_u8 vreinterpretq_f32_u32 - -float32x4_t vreinterpretq_f32_s32 (int32x4_t t); -#define vreinterpretq_f32_s32 vreinterpretq_f32_u32 - -float32x4_t vreinterpretq_f32_s16 (int16x8_t t); -#define vreinterpretq_f32_s16 vreinterpretq_f32_u32 - -float32x4_t vreinterpretq_f32_s8 (int8x16_t t); -#define vreinterpretq_f32_s8 vreinterpretq_f32_u32 - -float32x4_t vreinterpretq_f32_u64 (uint64x2_t t); -#define vreinterpretq_f32_u64 vreinterpretq_f32_u32 - -float32x4_t vreinterpretq_f32_s64 (int64x2_t t); -#define vreinterpretq_f32_s64 vreinterpretq_f32_u32 - -float32x4_t vreinterpretq_f32_p16 (poly16x8_t t); -#define vreinterpretq_f32_p16 vreinterpretq_f32_u32 - -float32x4_t vreinterpretq_f32_p8 (poly8x16_t t); -#define vreinterpretq_f32_p8 vreinterpretq_f32_u32 - -//*** Integer type conversions ****************** -//no conversion necessary for the following functions because it is the same data type -int64x1_t vreinterpret_s64_u32 (uint32x2_t t); -#define vreinterpret_s64_u32 - -int64x1_t vreinterpret_s64_u16 (uint16x4_t t); -#define vreinterpret_s64_u16 - -int64x1_t vreinterpret_s64_u8 (uint8x8_t t); -#define vreinterpret_s64_u8 - -int64x1_t vreinterpret_s64_s32 (int32x2_t t); -#define vreinterpret_s64_s32 - -int64x1_t vreinterpret_s64_s16 (int16x4_t t); -#define vreinterpret_s64_s16 - -int64x1_t vreinterpret_s64_s8 (int8x8_t t); -#define vreinterpret_s64_s8 - -int64x1_t vreinterpret_s64_u64 (uint64x1_t t); -#define vreinterpret_s64_u64 - -int64x1_t vreinterpret_s64_f32 (float32x2_t t); -#define vreinterpret_s64_f32 - -int64x1_t vreinterpret_s64_p16 (poly16x4_t t); -#define vreinterpret_s64_p16 - -int64x1_t vreinterpret_s64_p8 (poly8x8_t t); -#define vreinterpret_s64_p8 - -int64x2_t vreinterpretq_s64_u32 (uint32x4_t t); -#define vreinterpretq_s64_u32 - -int64x2_t vreinterpretq_s64_u16 (uint16x8_t t); -#define vreinterpretq_s64_u16 - -int64x2_t vreinterpretq_s64_u8 (uint8x16_t t); -#define vreinterpretq_s64_u8 - 
-int64x2_t vreinterpretq_s64_s32 (int32x4_t t); -#define vreinterpretq_s64_s32 - -int64x2_t vreinterpretq_s64_s16 (int16x8_t t); -#define vreinterpretq_s64_s16 - -int64x2_t vreinterpretq_s64_s8 (int8x16_t t); -#define vreinterpretq_s64_s8 - -int64x2_t vreinterpretq_s64_u64 (uint64x2_t t); -#define vreinterpretq_s64_u64 - -int64x2_t vreinterpretq_s64_f32 (float32x4_t t); -#define vreinterpretq_s64_f32(t) _M128i(t) - -int64x2_t vreinterpretq_s64_p16 (poly16x8_t t); -#define vreinterpretq_s64_p16 - -int64x2_t vreinterpretq_s64_p8 (poly8x16_t t); -#define vreinterpretq_s64_p8 - -uint64x1_t vreinterpret_u64_u32 (uint32x2_t t); -#define vreinterpret_u64_u32 - -uint64x1_t vreinterpret_u64_u16 (uint16x4_t t); -#define vreinterpret_u64_u16 - -uint64x1_t vreinterpret_u64_u8 (uint8x8_t t); -#define vreinterpret_u64_u8 - -uint64x1_t vreinterpret_u64_s32 (int32x2_t t); -#define vreinterpret_u64_s32 - -uint64x1_t vreinterpret_u64_s16 (int16x4_t t); -#define vreinterpret_u64_s16 - -uint64x1_t vreinterpret_u64_s8 (int8x8_t t); -#define vreinterpret_u64_s8 - -uint64x1_t vreinterpret_u64_s64 (int64x1_t t); -#define vreinterpret_u64_s64 - -uint64x1_t vreinterpret_u64_f32 (float32x2_t t); -#define vreinterpret_u64_f32 - -uint64x1_t vreinterpret_u64_p16 (poly16x4_t t); -#define vreinterpret_u64_p16 - -uint64x1_t vreinterpret_u64_p8 (poly8x8_t t); -#define vreinterpret_u64_p8 - -uint64x2_t vreinterpretq_u64_u32 (uint32x4_t t); -#define vreinterpretq_u64_u32 - -uint64x2_t vreinterpretq_u64_u16 (uint16x8_t t); -#define vreinterpretq_u64_u16 - -uint64x2_t vreinterpretq_u64_u8 (uint8x16_t t); -#define vreinterpretq_u64_u8 - -uint64x2_t vreinterpretq_u64_s32 (int32x4_t t); -#define vreinterpretq_u64_s32 - -uint64x2_t vreinterpretq_u64_s16 (int16x8_t t); -#define vreinterpretq_u64_s16 - -uint64x2_t vreinterpretq_u64_s8 (int8x16_t t); -#define vreinterpretq_u64_s8 - -uint64x2_t vreinterpretq_u64_s64 (int64x2_t t); -#define vreinterpretq_u64_s64 - -uint64x2_t vreinterpretq_u64_f32 (float32x4_t 
t); -#define vreinterpretq_u64_f32(t) _M128i(t) - -uint64x2_t vreinterpretq_u64_p16 (poly16x8_t t); -#define vreinterpretq_u64_p16 - -uint64x2_t vreinterpretq_u64_p8 (poly8x16_t t); -#define vreinterpretq_u64_p8 - -int8x8_t vreinterpret_s8_u32 (uint32x2_t t); -#define vreinterpret_s8_u32 - -int8x8_t vreinterpret_s8_u16 (uint16x4_t t); -#define vreinterpret_s8_u16 - -int8x8_t vreinterpret_s8_u8 (uint8x8_t t); -#define vreinterpret_s8_u8 - -int8x8_t vreinterpret_s8_s32 (int32x2_t t); -#define vreinterpret_s8_s32 - -int8x8_t vreinterpret_s8_s16 (int16x4_t t); -#define vreinterpret_s8_s16 - -int8x8_t vreinterpret_s8_u64 (uint64x1_t t); -#define vreinterpret_s8_u64 - -int8x8_t vreinterpret_s8_s64 (int64x1_t t); -#define vreinterpret_s8_s64 - -int8x8_t vreinterpret_s8_f32 (float32x2_t t); -#define vreinterpret_s8_f32 - -int8x8_t vreinterpret_s8_p16 (poly16x4_t t); -#define vreinterpret_s8_p16 - -int8x8_t vreinterpret_s8_p8 (poly8x8_t t); -#define vreinterpret_s8_p8 - -int8x16_t vreinterpretq_s8_u32 (uint32x4_t t); -#define vreinterpretq_s8_u32 - -int8x16_t vreinterpretq_s8_u16 (uint16x8_t t); -#define vreinterpretq_s8_u16 - -int8x16_t vreinterpretq_s8_u8 (uint8x16_t t); -#define vreinterpretq_s8_u8 - -int8x16_t vreinterpretq_s8_s32 (int32x4_t t); -#define vreinterpretq_s8_s32 - -int8x16_t vreinterpretq_s8_s16 (int16x8_t t); -#define vreinterpretq_s8_s16 - -int8x16_t vreinterpretq_s8_u64 (uint64x2_t t); -#define vreinterpretq_s8_u64 - -int8x16_t vreinterpretq_s8_s64 (int64x2_t t); -#define vreinterpretq_s8_s64 - -int8x16_t vreinterpretq_s8_f32 (float32x4_t t); -#define vreinterpretq_s8_f32(t) _M128i(t) - -int8x16_t vreinterpretq_s8_p16 (poly16x8_t t); -#define vreinterpretq_s8_p16 - -int8x16_t vreinterpretq_s8_p8 (poly8x16_t t); -#define vreinterpretq_s8_p8 - -int16x4_t vreinterpret_s16_u32 (uint32x2_t t); -#define vreinterpret_s16_u32 - -int16x4_t vreinterpret_s16_u16 (uint16x4_t t); -#define vreinterpret_s16_u16 - -int16x4_t vreinterpret_s16_u8 (uint8x8_t t); -#define 
vreinterpret_s16_u8 - -int16x4_t vreinterpret_s16_s32 (int32x2_t t); -#define vreinterpret_s16_s32 - -int16x4_t vreinterpret_s16_s8 (int8x8_t t); -#define vreinterpret_s16_s8 - -int16x4_t vreinterpret_s16_u64 (uint64x1_t t); -#define vreinterpret_s16_u64 - -int16x4_t vreinterpret_s16_s64 (int64x1_t t); -#define vreinterpret_s16_s64 - -int16x4_t vreinterpret_s16_f32 (float32x2_t t); -#define vreinterpret_s16_f32 - - -int16x4_t vreinterpret_s16_p16 (poly16x4_t t); -#define vreinterpret_s16_p16 - -int16x4_t vreinterpret_s16_p8 (poly8x8_t t); -#define vreinterpret_s16_p8 - -int16x8_t vreinterpretq_s16_u32 (uint32x4_t t); -#define vreinterpretq_s16_u32 - -int16x8_t vreinterpretq_s16_u16 (uint16x8_t t); -#define vreinterpretq_s16_u16 - -int16x8_t vreinterpretq_s16_u8 (uint8x16_t t); -#define vreinterpretq_s16_u8 - -int16x8_t vreinterpretq_s16_s32 (int32x4_t t); -#define vreinterpretq_s16_s32 - -int16x8_t vreinterpretq_s16_s8 (int8x16_t t); -#define vreinterpretq_s16_s8 - -int16x8_t vreinterpretq_s16_u64 (uint64x2_t t); -#define vreinterpretq_s16_u64 - -int16x8_t vreinterpretq_s16_s64 (int64x2_t t); -#define vreinterpretq_s16_s64 - -int16x8_t vreinterpretq_s16_f32 (float32x4_t t); -#define vreinterpretq_s16_f32(t) _M128i(t) - -int16x8_t vreinterpretq_s16_p16 (poly16x8_t t); -#define vreinterpretq_s16_p16 - -int16x8_t vreinterpretq_s16_p8 (poly8x16_t t); -#define vreinterpretq_s16_p8 - -int32x2_t vreinterpret_s32_u32 (uint32x2_t t); -#define vreinterpret_s32_u32 - -int32x2_t vreinterpret_s32_u16 (uint16x4_t t); -#define vreinterpret_s32_u16 - -int32x2_t vreinterpret_s32_u8 (uint8x8_t t); -#define vreinterpret_s32_u8 - -int32x2_t vreinterpret_s32_s16 (int16x4_t t); -#define vreinterpret_s32_s16 - -int32x2_t vreinterpret_s32_s8 (int8x8_t t); -#define vreinterpret_s32_s8 - -int32x2_t vreinterpret_s32_u64 (uint64x1_t t); -#define vreinterpret_s32_u64 - -int32x2_t vreinterpret_s32_s64 (int64x1_t t); -#define vreinterpret_s32_s64 - -int32x2_t vreinterpret_s32_f32 (float32x2_t 
t); -#define vreinterpret_s32_f32 - -int32x2_t vreinterpret_s32_p16 (poly16x4_t t); -#define vreinterpret_s32_p16 - -int32x2_t vreinterpret_s32_p8 (poly8x8_t t); -#define vreinterpret_s32_p8 - -int32x4_t vreinterpretq_s32_u32 (uint32x4_t t); -#define vreinterpretq_s32_u32 - -int32x4_t vreinterpretq_s32_u16 (uint16x8_t t); -#define vreinterpretq_s32_u16 - -int32x4_t vreinterpretq_s32_u8 (uint8x16_t t); -#define vreinterpretq_s32_u8 - -int32x4_t vreinterpretq_s32_s16 (int16x8_t t); -#define vreinterpretq_s32_s16 - -int32x4_t vreinterpretq_s32_s8 (int8x16_t t); -#define vreinterpretq_s32_s8 - -int32x4_t vreinterpretq_s32_u64 (uint64x2_t t); -#define vreinterpretq_s32_u64 - -int32x4_t vreinterpretq_s32_s64 (int64x2_t t); -#define vreinterpretq_s32_s64 - -int32x4_t vreinterpretq_s32_f32 (float32x4_t t); -#define vreinterpretq_s32_f32(t) _mm_castps_si128(t) //(*(__m128i*)&(t)) - -int32x4_t vreinterpretq_s32_p16 (poly16x8_t t); -#define vreinterpretq_s32_p16 - -int32x4_t vreinterpretq_s32_p8 (poly8x16_t t); -#define vreinterpretq_s32_p8 - -uint8x8_t vreinterpret_u8_u32 (uint32x2_t t); -#define vreinterpret_u8_u32 - -uint8x8_t vreinterpret_u8_u16 (uint16x4_t t); -#define vreinterpret_u8_u16 - -uint8x8_t vreinterpret_u8_s32 (int32x2_t t); -#define vreinterpret_u8_s32 - -uint8x8_t vreinterpret_u8_s16 (int16x4_t t); -#define vreinterpret_u8_s16 - -uint8x8_t vreinterpret_u8_s8 (int8x8_t t); -#define vreinterpret_u8_s8 - -uint8x8_t vreinterpret_u8_u64 (uint64x1_t t); -#define vreinterpret_u8_u64 - -uint8x8_t vreinterpret_u8_s64 (int64x1_t t); -#define vreinterpret_u8_s64 - -uint8x8_t vreinterpret_u8_f32 (float32x2_t t); -#define vreinterpret_u8_f32 - -uint8x8_t vreinterpret_u8_p16 (poly16x4_t t); -#define vreinterpret_u8_p16 - -uint8x8_t vreinterpret_u8_p8 (poly8x8_t t); -#define vreinterpret_u8_p8 - -uint8x16_t vreinterpretq_u8_u32 (uint32x4_t t); -#define vreinterpretq_u8_u32 - -uint8x16_t vreinterpretq_u8_u16 (uint16x8_t t); -#define vreinterpretq_u8_u16 - -uint8x16_t 
vreinterpretq_u8_s32 (int32x4_t t); -#define vreinterpretq_u8_s32 - -uint8x16_t vreinterpretq_u8_s16 (int16x8_t t); -#define vreinterpretq_u8_s16 - -uint8x16_t vreinterpretq_u8_s8 (int8x16_t t); -#define vreinterpretq_u8_s8 - -uint8x16_t vreinterpretq_u8_u64 (uint64x2_t t); -#define vreinterpretq_u8_u64 - -uint8x16_t vreinterpretq_u8_s64 (int64x2_t t); -#define vreinterpretq_u8_s64 - -uint8x16_t vreinterpretq_u8_f32 (float32x4_t t); -#define vreinterpretq_u8_f32(t) _M128i(t) - - -uint8x16_t vreinterpretq_u8_p16 (poly16x8_t t); -#define vreinterpretq_u8_p16 - -uint8x16_t vreinterpretq_u8_p8 (poly8x16_t t); -#define vreinterpretq_u8_p8 - -uint16x4_t vreinterpret_u16_u32 (uint32x2_t t); -#define vreinterpret_u16_u32 - -uint16x4_t vreinterpret_u16_u8 (uint8x8_t t); -#define vreinterpret_u16_u8 - -uint16x4_t vreinterpret_u16_s32 (int32x2_t t); -#define vreinterpret_u16_s32 - -uint16x4_t vreinterpret_u16_s16 (int16x4_t t); -#define vreinterpret_u16_s16 - -uint16x4_t vreinterpret_u16_s8 (int8x8_t t); -#define vreinterpret_u16_s8 - -uint16x4_t vreinterpret_u16_u64 (uint64x1_t t); -#define vreinterpret_u16_u64 - -uint16x4_t vreinterpret_u16_s64 (int64x1_t t); -#define vreinterpret_u16_s64 - -uint16x4_t vreinterpret_u16_f32 (float32x2_t t); -#define vreinterpret_u16_f32 - -uint16x4_t vreinterpret_u16_p16 (poly16x4_t t); -#define vreinterpret_u16_p16 - -uint16x4_t vreinterpret_u16_p8 (poly8x8_t t); -#define vreinterpret_u16_p8 - -uint16x8_t vreinterpretq_u16_u32 (uint32x4_t t); -#define vreinterpretq_u16_u32 - -uint16x8_t vreinterpretq_u16_u8 (uint8x16_t t); -#define vreinterpretq_u16_u8 - -uint16x8_t vreinterpretq_u16_s32 (int32x4_t t); -#define vreinterpretq_u16_s32 - -uint16x8_t vreinterpretq_u16_s16 (int16x8_t t); -#define vreinterpretq_u16_s16 - -uint16x8_t vreinterpretq_u16_s8 (int8x16_t t); -#define vreinterpretq_u16_s8 - -uint16x8_t vreinterpretq_u16_u64 (uint64x2_t t); -#define vreinterpretq_u16_u64 - -uint16x8_t vreinterpretq_u16_s64 (int64x2_t t); -#define 
vreinterpretq_u16_s64 - -uint16x8_t vreinterpretq_u16_f32 (float32x4_t t); -#define vreinterpretq_u16_f32(t) _M128i(t) - -uint16x8_t vreinterpretq_u16_p16 (poly16x8_t t); -#define vreinterpretq_u16_p16 - -uint16x8_t vreinterpretq_u16_p8 (poly8x16_t t); -#define vreinterpretq_u16_p8 - -uint32x2_t vreinterpret_u32_u16 (uint16x4_t t); -#define vreinterpret_u32_u16 - -uint32x2_t vreinterpret_u32_u8 (uint8x8_t t); -#define vreinterpret_u32_u8 - -uint32x2_t vreinterpret_u32_s32 (int32x2_t t); -#define vreinterpret_u32_s32 - -uint32x2_t vreinterpret_u32_s16 (int16x4_t t); -#define vreinterpret_u32_s16 - -uint32x2_t vreinterpret_u32_s8 (int8x8_t t); -#define vreinterpret_u32_s8 - -uint32x2_t vreinterpret_u32_u64 (uint64x1_t t); -#define vreinterpret_u32_u64 - -uint32x2_t vreinterpret_u32_s64 (int64x1_t t); -#define vreinterpret_u32_s64 - -uint32x2_t vreinterpret_u32_f32 (float32x2_t t); -#define vreinterpret_u32_f32 - -uint32x2_t vreinterpret_u32_p16 (poly16x4_t t); -#define vreinterpret_u32_p16 - -uint32x2_t vreinterpret_u32_p8 (poly8x8_t t); -#define vreinterpret_u32_p8 - -uint32x4_t vreinterpretq_u32_u16 (uint16x8_t t); -#define vreinterpretq_u32_u16 - -uint32x4_t vreinterpretq_u32_u8 (uint8x16_t t); -#define vreinterpretq_u32_u8 - -uint32x4_t vreinterpretq_u32_s32 (int32x4_t t); -#define vreinterpretq_u32_s32 - -uint32x4_t vreinterpretq_u32_s16 (int16x8_t t); -#define vreinterpretq_u32_s16 - -uint32x4_t vreinterpretq_u32_s8 (int8x16_t t); -#define vreinterpretq_u32_s8 - -uint32x4_t vreinterpretq_u32_u64 (uint64x2_t t); -#define vreinterpretq_u32_u64 - -uint32x4_t vreinterpretq_u32_s64 (int64x2_t t); -#define vreinterpretq_u32_s64 - -uint32x4_t vreinterpretq_u32_f32 (float32x4_t t); -#define vreinterpretq_u32_f32(t) _M128i(t) - -uint32x4_t vreinterpretq_u32_p16 (poly16x8_t t); -#define vreinterpretq_u32_p16 - -uint32x4_t vreinterpretq_u32_p8 (poly8x16_t t); -#define vreinterpretq_u32_p8 - -#endif /* NEON2SSE_H */ diff --git 
a/lib/gcc/x86_64-linux-android/4.8/include/avx2intrin.h b/lib/gcc/x86_64-linux-android/4.8/include/avx2intrin.h
deleted file mode 100644
index 1537bf5..0000000
--- a/lib/gcc/x86_64-linux-android/4.8/include/avx2intrin.h
+++ /dev/null
@@ -1,1873 +0,0 @@
-/* Copyright (C) 2011-2013 Free Software Foundation, Inc.
-
-   This file is part of GCC.
-
-   GCC is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 3, or (at your option)
-   any later version.
-
-   GCC is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   Under Section 7 of GPL version 3, you are granted additional
-   permissions described in the GCC Runtime Library Exception, version
-   3.1, as published by the Free Software Foundation.
-
-   You should have received a copy of the GNU General Public License and
-   a copy of the GCC Runtime Library Exception along with this program;
-   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#ifndef _IMMINTRIN_H_INCLUDED
-# error "Never use <avx2intrin.h> directly; include <immintrin.h> instead."
-#endif
-
-/* Sum absolute 8-bit integer difference of adjacent groups of 4
-   byte integers in the first 2 operands.  Starting offsets within
-   operands are determined by the 3rd mask operand.  */
-#ifdef __OPTIMIZE__
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_mpsadbw_epu8 (__m256i __X, __m256i __Y, const int __M)
-{
-  return (__m256i) __builtin_ia32_mpsadbw256 ((__v32qi)__X, (__v32qi)__Y, __M);
-}
-#else
-#define _mm256_mpsadbw_epu8(X, Y, M) ((__m256i) __builtin_ia32_mpsadbw256 ((__v32qi)(__m256i)(X), (__v32qi)(__m256i)(Y), (int)(M)))
-#endif
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_abs_epi8 (__m256i __A)
-{
-  return (__m256i)__builtin_ia32_pabsb256 ((__v32qi)__A);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_abs_epi16 (__m256i __A)
-{
-  return (__m256i)__builtin_ia32_pabsw256 ((__v16hi)__A);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_abs_epi32 (__m256i __A)
-{
-  return (__m256i)__builtin_ia32_pabsd256 ((__v8si)__A);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_packs_epi32 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_packssdw256 ((__v8si)__A, (__v8si)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_packs_epi16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_packsswb256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_packus_epi32 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_packusdw256 ((__v8si)__A, (__v8si)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_packus_epi16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_packuswb256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_add_epi8 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_paddb256 ((__v32qi)__A, (__v32qi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_add_epi16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_paddw256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_add_epi32 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_paddd256 ((__v8si)__A, (__v8si)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_add_epi64 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_paddq256 ((__v4di)__A, (__v4di)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_adds_epi8 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_paddsb256 ((__v32qi)__A, (__v32qi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_adds_epi16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_paddsw256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_adds_epu8 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_paddusb256 ((__v32qi)__A, (__v32qi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_adds_epu16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_paddusw256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-#ifdef __OPTIMIZE__
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_alignr_epi8 (__m256i __A, __m256i __B, const int __N)
-{
-  return (__m256i) __builtin_ia32_palignr256 ((__v4di)__A, (__v4di)__B, __N * 8);
-}
-#else
-/* In that case (__N*8) will be in vreg, and insn will not be matched. */
-/* Use define instead */
-#define _mm256_alignr_epi8(A, B, N) ((__m256i) __builtin_ia32_palignr256 ((__v4di)(__m256i)(A), (__v4di)(__m256i)(B), (int)(N) * 8))
-#endif
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_and_si256 (__m256i __A, __m256i __B)
-{
-  return (__m256i) __builtin_ia32_andsi256 ((__v4di)__A, (__v4di)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_andnot_si256 (__m256i __A, __m256i __B)
-{
-  return (__m256i) __builtin_ia32_andnotsi256 ((__v4di)__A, (__v4di)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_avg_epu8 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pavgb256 ((__v32qi)__A, (__v32qi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_avg_epu16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pavgw256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_blendv_epi8 (__m256i __X, __m256i __Y, __m256i __M)
-{
-  return (__m256i) __builtin_ia32_pblendvb256 ((__v32qi)__X, (__v32qi)__Y, (__v32qi)__M);
-}
-
-#ifdef __OPTIMIZE__
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_blend_epi16 (__m256i __X, __m256i __Y, const int __M)
-{
-  return (__m256i) __builtin_ia32_pblendw256 ((__v16hi)__X, (__v16hi)__Y, __M);
-}
-#else
-#define _mm256_blend_epi16(X, Y, M) ((__m256i) __builtin_ia32_pblendw256 ((__v16hi)(__m256i)(X), (__v16hi)(__m256i)(Y), (int)(M)))
-#endif
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cmpeq_epi8 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pcmpeqb256 ((__v32qi)__A, (__v32qi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cmpeq_epi16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pcmpeqw256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cmpeq_epi32 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pcmpeqd256 ((__v8si)__A, (__v8si)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cmpeq_epi64 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pcmpeqq256 ((__v4di)__A, (__v4di)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cmpgt_epi8 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pcmpgtb256 ((__v32qi)__A, (__v32qi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cmpgt_epi16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pcmpgtw256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cmpgt_epi32 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pcmpgtd256 ((__v8si)__A, (__v8si)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cmpgt_epi64 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pcmpgtq256 ((__v4di)__A, (__v4di)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_hadd_epi16 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_phaddw256 ((__v16hi)__X, (__v16hi)__Y);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_hadd_epi32 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_phaddd256 ((__v8si)__X, (__v8si)__Y);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_hadds_epi16 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_phaddsw256 ((__v16hi)__X, (__v16hi)__Y);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_hsub_epi16 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_phsubw256 ((__v16hi)__X, (__v16hi)__Y);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_hsub_epi32 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_phsubd256 ((__v8si)__X, (__v8si)__Y);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_hsubs_epi16 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_phsubsw256 ((__v16hi)__X, (__v16hi)__Y);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_maddubs_epi16 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_pmaddubsw256 ((__v32qi)__X, (__v32qi)__Y);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_madd_epi16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pmaddwd256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_max_epi8 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pmaxsb256 ((__v32qi)__A, (__v32qi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_max_epi16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pmaxsw256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_max_epi32 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pmaxsd256 ((__v8si)__A, (__v8si)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_max_epu8 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pmaxub256 ((__v32qi)__A, (__v32qi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_max_epu16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pmaxuw256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_max_epu32 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pmaxud256 ((__v8si)__A, (__v8si)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_min_epi8 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pminsb256 ((__v32qi)__A, (__v32qi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_min_epi16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pminsw256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_min_epi32 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pminsd256 ((__v8si)__A, (__v8si)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_min_epu8 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pminub256 ((__v32qi)__A, (__v32qi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_min_epu16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pminuw256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_min_epu32 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pminud256 ((__v8si)__A, (__v8si)__B);
-}
-
-extern __inline int
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_movemask_epi8 (__m256i __A)
-{
-  return __builtin_ia32_pmovmskb256 ((__v32qi)__A);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cvtepi8_epi16 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_pmovsxbw256 ((__v16qi)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cvtepi8_epi32 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_pmovsxbd256 ((__v16qi)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cvtepi8_epi64 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_pmovsxbq256 ((__v16qi)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cvtepi16_epi32 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_pmovsxwd256 ((__v8hi)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cvtepi16_epi64 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_pmovsxwq256 ((__v8hi)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cvtepi32_epi64 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_pmovsxdq256 ((__v4si)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cvtepu8_epi16 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_pmovzxbw256 ((__v16qi)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cvtepu8_epi32 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_pmovzxbd256 ((__v16qi)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cvtepu8_epi64 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_pmovzxbq256 ((__v16qi)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cvtepu16_epi32 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_pmovzxwd256 ((__v8hi)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cvtepu16_epi64 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_pmovzxwq256 ((__v8hi)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_cvtepu32_epi64 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_pmovzxdq256 ((__v4si)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_mul_epi32 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_pmuldq256 ((__v8si)__X, (__v8si)__Y);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_mulhrs_epi16 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_pmulhrsw256 ((__v16hi)__X, (__v16hi)__Y);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_mulhi_epu16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pmulhuw256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_mulhi_epi16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pmulhw256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_mullo_epi16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pmullw256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_mullo_epi32 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pmulld256 ((__v8si)__A, (__v8si)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_mul_epu32 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pmuludq256 ((__v8si)__A, (__v8si)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_or_si256 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_por256 ((__v4di)__A, (__v4di)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_sad_epu8 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_psadbw256 ((__v32qi)__A, (__v32qi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_shuffle_epi8 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_pshufb256 ((__v32qi)__X, (__v32qi)__Y);
-}
-
-#ifdef __OPTIMIZE__
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_shuffle_epi32 (__m256i __A, const int __mask)
-{
-  return (__m256i)__builtin_ia32_pshufd256 ((__v8si)__A, __mask);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_shufflehi_epi16 (__m256i __A, const int __mask)
-{
-  return (__m256i)__builtin_ia32_pshufhw256 ((__v16hi)__A, __mask);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_shufflelo_epi16 (__m256i __A, const int __mask)
-{
-  return (__m256i)__builtin_ia32_pshuflw256 ((__v16hi)__A, __mask);
-}
-#else
-#define _mm256_shuffle_epi32(A, N) ((__m256i)__builtin_ia32_pshufd256 ((__v8si)(__m256i)(A), (int)(N)))
-#define _mm256_shufflehi_epi16(A, N) ((__m256i)__builtin_ia32_pshufhw256 ((__v16hi)(__m256i)(A), (int)(N)))
-#define _mm256_shufflelo_epi16(A, N) ((__m256i)__builtin_ia32_pshuflw256 ((__v16hi)(__m256i)(A), (int)(N)))
-#endif
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_sign_epi8 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_psignb256 ((__v32qi)__X, (__v32qi)__Y);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_sign_epi16 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_psignw256 ((__v16hi)__X, (__v16hi)__Y);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_sign_epi32 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_psignd256 ((__v8si)__X, (__v8si)__Y);
-}
-
-#ifdef __OPTIMIZE__
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_slli_si256 (__m256i __A, const int __N)
-{
-  return (__m256i)__builtin_ia32_pslldqi256 (__A, __N * 8);
-}
-#else
-#define _mm256_slli_si256(A, N) ((__m256i)__builtin_ia32_pslldqi256 ((__m256i)(A), (int)(N) * 8))
-#endif
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_slli_epi16 (__m256i __A, int __B)
-{
-  return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)__A, __B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_sll_epi16 (__m256i __A, __m128i __B)
-{
-  return (__m256i)__builtin_ia32_psllw256((__v16hi)__A, (__v8hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_slli_epi32 (__m256i __A, int __B)
-{
-  return (__m256i)__builtin_ia32_pslldi256 ((__v8si)__A, __B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_sll_epi32 (__m256i __A, __m128i __B)
-{
-  return (__m256i)__builtin_ia32_pslld256((__v8si)__A, (__v4si)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_slli_epi64 (__m256i __A, int __B)
-{
-  return (__m256i)__builtin_ia32_psllqi256 ((__v4di)__A, __B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_sll_epi64 (__m256i __A, __m128i __B)
-{
-  return (__m256i)__builtin_ia32_psllq256((__v4di)__A, (__v2di)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_srai_epi16 (__m256i __A, int __B)
-{
-  return (__m256i)__builtin_ia32_psrawi256 ((__v16hi)__A, __B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_sra_epi16 (__m256i __A, __m128i __B)
-{
-  return (__m256i)__builtin_ia32_psraw256 ((__v16hi)__A, (__v8hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_srai_epi32 (__m256i __A, int __B)
-{
-  return (__m256i)__builtin_ia32_psradi256 ((__v8si)__A, __B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_sra_epi32 (__m256i __A, __m128i __B)
-{
-  return (__m256i)__builtin_ia32_psrad256 ((__v8si)__A, (__v4si)__B);
-}
-
-#ifdef __OPTIMIZE__
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_srli_si256 (__m256i __A, const int __N)
-{
-  return (__m256i)__builtin_ia32_psrldqi256 (__A, __N * 8);
-}
-#else
-#define _mm256_srli_si256(A, N) ((__m256i)__builtin_ia32_psrldqi256 ((__m256i)(A), (int)(N) * 8))
-#endif
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_srli_epi16 (__m256i __A, int __B)
-{
-  return (__m256i)__builtin_ia32_psrlwi256 ((__v16hi)__A, __B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_srl_epi16 (__m256i __A, __m128i __B)
-{
-  return (__m256i)__builtin_ia32_psrlw256((__v16hi)__A, (__v8hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_srli_epi32 (__m256i __A, int __B)
-{
-  return (__m256i)__builtin_ia32_psrldi256 ((__v8si)__A, __B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_srl_epi32 (__m256i __A, __m128i __B)
-{
-  return (__m256i)__builtin_ia32_psrld256((__v8si)__A, (__v4si)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_srli_epi64 (__m256i __A, int __B)
-{
-  return (__m256i)__builtin_ia32_psrlqi256 ((__v4di)__A, __B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_srl_epi64 (__m256i __A, __m128i __B)
-{
-  return (__m256i)__builtin_ia32_psrlq256((__v4di)__A, (__v2di)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_sub_epi8 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_psubb256 ((__v32qi)__A, (__v32qi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_sub_epi16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_psubw256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_sub_epi32 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_psubd256 ((__v8si)__A, (__v8si)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_sub_epi64 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_psubq256 ((__v4di)__A, (__v4di)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_subs_epi8 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_psubsb256 ((__v32qi)__A, (__v32qi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_subs_epi16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_psubsw256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_subs_epu8 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_psubusb256 ((__v32qi)__A, (__v32qi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_subs_epu16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_psubusw256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_unpackhi_epi8 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_punpckhbw256 ((__v32qi)__A, (__v32qi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_unpackhi_epi16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_punpckhwd256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_unpackhi_epi32 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_punpckhdq256 ((__v8si)__A, (__v8si)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_unpackhi_epi64 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_punpckhqdq256 ((__v4di)__A, (__v4di)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_unpacklo_epi8 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_punpcklbw256 ((__v32qi)__A, (__v32qi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_unpacklo_epi16 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_punpcklwd256 ((__v16hi)__A, (__v16hi)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_unpacklo_epi32 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_punpckldq256 ((__v8si)__A, (__v8si)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_unpacklo_epi64 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_punpcklqdq256 ((__v4di)__A, (__v4di)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_xor_si256 (__m256i __A, __m256i __B)
-{
-  return (__m256i)__builtin_ia32_pxor256 ((__v4di)__A, (__v4di)__B);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_stream_load_si256 (__m256i const *__X)
-{
-  return (__m256i) __builtin_ia32_movntdqa256 ((__v4di *) __X);
-}
-
-extern __inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_broadcastss_ps (__m128 __X)
-{
-  return (__m128) __builtin_ia32_vbroadcastss_ps ((__v4sf)__X);
-}
-
-extern __inline __m256
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_broadcastss_ps (__m128 __X)
-{
-  return (__m256) __builtin_ia32_vbroadcastss_ps256 ((__v4sf)__X);
-}
-
-extern __inline __m256d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_broadcastsd_pd (__m128d __X)
-{
-  return (__m256d) __builtin_ia32_vbroadcastsd_pd256 ((__v2df)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_broadcastsi128_si256 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_vbroadcastsi256 ((__v2di)__X);
-}
-
-#ifdef __OPTIMIZE__
-extern __inline __m128i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_blend_epi32 (__m128i __X, __m128i __Y, const int __M)
-{
-  return (__m128i) __builtin_ia32_pblendd128 ((__v4si)__X, (__v4si)__Y, __M);
-}
-#else
-#define _mm_blend_epi32(X, Y, M) ((__m128i) __builtin_ia32_pblendd128 ((__v4si)(__m128i)(X), (__v4si)(__m128i)(Y), (int)(M)))
-#endif
-
-#ifdef __OPTIMIZE__
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_blend_epi32 (__m256i __X, __m256i __Y, const int __M)
-{
-  return (__m256i) __builtin_ia32_pblendd256 ((__v8si)__X, (__v8si)__Y, __M);
-}
-#else
-#define _mm256_blend_epi32(X, Y, M) ((__m256i) __builtin_ia32_pblendd256 ((__v8si)(__m256i)(X), (__v8si)(__m256i)(Y), (int)(M)))
-#endif
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_broadcastb_epi8 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_pbroadcastb256 ((__v16qi)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_broadcastw_epi16 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_pbroadcastw256 ((__v8hi)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_broadcastd_epi32 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_pbroadcastd256 ((__v4si)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_broadcastq_epi64 (__m128i __X)
-{
-  return (__m256i) __builtin_ia32_pbroadcastq256 ((__v2di)__X);
-}
-
-extern __inline __m128i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_broadcastb_epi8 (__m128i __X)
-{
-  return (__m128i) __builtin_ia32_pbroadcastb128 ((__v16qi)__X);
-}
-
-extern __inline __m128i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_broadcastw_epi16 (__m128i __X)
-{
-  return (__m128i) __builtin_ia32_pbroadcastw128 ((__v8hi)__X);
-}
-
-extern __inline __m128i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_broadcastd_epi32 (__m128i __X)
-{
-  return (__m128i) __builtin_ia32_pbroadcastd128 ((__v4si)__X);
-}
-
-extern __inline __m128i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_broadcastq_epi64 (__m128i __X)
-{
-  return (__m128i) __builtin_ia32_pbroadcastq128 ((__v2di)__X);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_permutevar8x32_epi32 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_permvarsi256 ((__v8si)__X, (__v8si)__Y);
-}
-
-#ifdef __OPTIMIZE__
-extern __inline __m256d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_permute4x64_pd (__m256d __X, const int __M)
-{
-  return (__m256d) __builtin_ia32_permdf256 ((__v4df)__X, __M);
-}
-#else
-#define _mm256_permute4x64_pd(X, M) ((__m256d) __builtin_ia32_permdf256 ((__v4df)(__m256d)(X), (int)(M)))
-#endif
-
-extern __inline __m256
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_permutevar8x32_ps (__m256 __X, __m256i __Y)
-{
-  return (__m256) __builtin_ia32_permvarsf256 ((__v8sf)__X, (__v8si)__Y);
-}
-
-#ifdef __OPTIMIZE__
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_permute4x64_epi64 (__m256i __X, const int __M)
-{
-  return (__m256i) __builtin_ia32_permdi256 ((__v4di)__X, __M);
-}
-#else
-#define _mm256_permute4x64_epi64(X, M) ((__m256i) __builtin_ia32_permdi256 ((__v4di)(__m256i)(X), (int)(M)))
-#endif
-
-
-#ifdef __OPTIMIZE__
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_permute2x128_si256 (__m256i __X, __m256i __Y, const int __M)
-{
-  return (__m256i) __builtin_ia32_permti256 ((__v4di)__X, (__v4di)__Y, __M);
-}
-#else
-#define _mm256_permute2x128_si256(X, Y, M) ((__m256i) __builtin_ia32_permti256 ((__v4di)(__m256i)(X), (__v4di)(__m256i)(Y), (int)(M)))
-#endif
-
-#ifdef __OPTIMIZE__
-extern __inline __m128i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_extracti128_si256 (__m256i __X, const int __M)
-{
-  return (__m128i) __builtin_ia32_extract128i256 ((__v4di)__X, __M);
-}
-#else
-#define _mm256_extracti128_si256(X, M) ((__m128i) __builtin_ia32_extract128i256 ((__v4di)(__m256i)(X), (int)(M)))
-#endif
-
-#ifdef __OPTIMIZE__
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_inserti128_si256 (__m256i __X, __m128i __Y, const int __M)
-{
-  return (__m256i) __builtin_ia32_insert128i256 ((__v4di)__X, (__v2di)__Y, __M);
-}
-#else
-#define _mm256_inserti128_si256(X, Y, M) ((__m256i) __builtin_ia32_insert128i256 ((__v4di)(__m256i)(X), (__v2di)(__m128i)(Y), (int)(M)))
-#endif
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_maskload_epi32 (int const *__X, __m256i __M )
-{
-  return (__m256i) __builtin_ia32_maskloadd256 ((const __v8si *)__X, (__v8si)__M);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_maskload_epi64 (long long const *__X, __m256i __M )
-{
-  return (__m256i) __builtin_ia32_maskloadq256 ((const __v4di *)__X, (__v4di)__M);
-}
-
-extern __inline __m128i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_maskload_epi32 (int const *__X, __m128i __M )
-{
-  return (__m128i) __builtin_ia32_maskloadd ((const __v4si *)__X, (__v4si)__M);
-}
-
-extern __inline __m128i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_maskload_epi64 (long long const *__X, __m128i __M )
-{
-  return (__m128i) __builtin_ia32_maskloadq ((const __v2di *)__X, (__v2di)__M);
-}
-
-extern __inline void
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_maskstore_epi32 (int *__X, __m256i __M, __m256i __Y )
-{
-  __builtin_ia32_maskstored256 ((__v8si *)__X, (__v8si)__M, (__v8si)__Y);
-}
-
-extern __inline void
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_maskstore_epi64 (long long *__X, __m256i __M, __m256i __Y )
-{
-  __builtin_ia32_maskstoreq256 ((__v4di *)__X, (__v4di)__M, (__v4di)__Y);
-}
-
-extern __inline void
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_maskstore_epi32 (int *__X, __m128i __M, __m128i __Y )
-{
-  __builtin_ia32_maskstored ((__v4si *)__X, (__v4si)__M, (__v4si)__Y);
-}
-
-extern __inline void
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_maskstore_epi64 (long long *__X, __m128i __M, __m128i __Y )
-{
-  __builtin_ia32_maskstoreq (( __v2di *)__X, (__v2di)__M, (__v2di)__Y);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_sllv_epi32 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_psllv8si ((__v8si)__X, (__v8si)__Y);
-}
-
-extern __inline __m128i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_sllv_epi32 (__m128i __X, __m128i __Y)
-{
-  return (__m128i) __builtin_ia32_psllv4si ((__v4si)__X, (__v4si)__Y);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_sllv_epi64 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_psllv4di ((__v4di)__X, (__v4di)__Y);
-}
-
-extern __inline __m128i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_sllv_epi64 (__m128i __X, __m128i __Y)
-{
-  return (__m128i) __builtin_ia32_psllv2di ((__v2di)__X, (__v2di)__Y);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_srav_epi32 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_psrav8si ((__v8si)__X, (__v8si)__Y);
-}
-
-extern __inline __m128i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_srav_epi32 (__m128i __X, __m128i __Y)
-{
-  return (__m128i) __builtin_ia32_psrav4si ((__v4si)__X, (__v4si)__Y);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_srlv_epi32 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_psrlv8si ((__v8si)__X, (__v8si)__Y);
-}
-
-extern __inline __m128i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_srlv_epi32 (__m128i __X, __m128i __Y)
-{
-  return (__m128i) __builtin_ia32_psrlv4si ((__v4si)__X, (__v4si)__Y);
-}
-
-extern __inline __m256i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_srlv_epi64 (__m256i __X, __m256i __Y)
-{
-  return (__m256i) __builtin_ia32_psrlv4di ((__v4di)__X, (__v4di)__Y);
-}
-
-extern __inline __m128i
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_srlv_epi64 (__m128i __X, __m128i __Y)
-{
-  return (__m128i) __builtin_ia32_psrlv2di ((__v2di)__X, (__v2di)__Y);
-}
-
-#ifdef __OPTIMIZE__
-extern __inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_i32gather_pd (double const *base, __m128i index, const int scale)
-{
-  __v2df src = _mm_setzero_pd ();
-  __v2df mask = _mm_cmpeq_pd (src, src);
-
-  return (__m128d) __builtin_ia32_gathersiv2df (src, base, (__v4si)index, mask, scale);
-}
-
-extern __inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_i32gather_pd (__m128d src, double const *base, __m128i index, __m128d mask, const int scale)
-{
-  return (__m128d) __builtin_ia32_gathersiv2df ((__v2df)src, base, (__v4si)index, (__v2df)mask, scale);
-}
-
-extern __inline __m256d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_i32gather_pd (double const *base, __m128i index, const int scale)
-{
-  __v4df src = _mm256_setzero_pd ();
-  __v4df mask = _mm256_cmp_pd (src, src, _CMP_EQ_OQ);
-
-  return (__m256d) __builtin_ia32_gathersiv4df (src, base, (__v4si)index, mask, scale);
-}
-
-extern __inline __m256d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_mask_i32gather_pd (__m256d src, double const *base, __m128i index, __m256d mask, const int scale)
-{
-  return (__m256d) __builtin_ia32_gathersiv4df ((__v4df)src, base, (__v4si)index, (__v4df)mask, scale);
-}
-
-extern __inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_i64gather_pd (double const *base, __m128i index, const int scale)
-{
-  __v2df src = _mm_setzero_pd ();
-  __v2df mask = _mm_cmpeq_pd (src, src);
-
-  return (__m128d) __builtin_ia32_gatherdiv2df (src, base, (__v2di)index, mask, scale);
-}
-
-extern __inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_i64gather_pd (__m128d src, double const *base, __m128i index, __m128d mask, const int scale)
-{
-  return (__m128d) __builtin_ia32_gatherdiv2df ((__v2df)src, base, (__v2di)index, (__v2df)mask, scale);
-}
-
-extern __inline __m256d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_i64gather_pd (double const *base, __m256i index, const int scale)
-{
-  __v4df src = _mm256_setzero_pd ();
-  __v4df mask = _mm256_cmp_pd (src, src, _CMP_EQ_OQ);
-
-  return (__m256d) __builtin_ia32_gatherdiv4df (src, base, (__v4di)index, mask, scale);
-}
-
-extern __inline __m256d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_mask_i64gather_pd (__m256d src, double const *base, __m256i index, __m256d mask, const int scale)
-{
-  return (__m256d) __builtin_ia32_gatherdiv4df ((__v4df)src, base, (__v4di)index, (__v4df)mask, scale);
-}
-
-extern __inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_i32gather_ps (float const *base, __m128i index, const int scale)
-{
-  __v4sf src = _mm_setzero_ps ();
-  __v4sf mask = _mm_cmpeq_ps (src, src);
-
-  return (__m128) __builtin_ia32_gathersiv4sf (src, base, (__v4si)index, mask, scale);
-}
-
-extern __inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_i32gather_ps (__m128 src, float const *base, __m128i index, __m128 mask, const int scale)
-{
-  return (__m128) __builtin_ia32_gathersiv4sf ((__v4sf)src, base, (__v4si)index, (__v4sf)mask, scale);
-}
-
-extern __inline __m256
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_i32gather_ps (float const *base, __m256i index, const int
scale) -{ - __v8sf src = _mm256_setzero_ps (); - __v8sf mask = _mm256_cmp_ps (src, src, _CMP_EQ_OQ); - - return (__m256) __builtin_ia32_gathersiv8sf (src, - base, - (__v8si)index, - mask, - scale); -} - -extern __inline __m256 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_mask_i32gather_ps (__m256 src, float const *base, - __m256i index, __m256 mask, const int scale) -{ - return (__m256) __builtin_ia32_gathersiv8sf ((__v8sf)src, - base, - (__v8si)index, - (__v8sf)mask, - scale); -} - -extern __inline __m128 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_i64gather_ps (float const *base, __m128i index, const int scale) -{ - __v4sf src = _mm_setzero_ps (); - __v4sf mask = _mm_cmpeq_ps (src, src); - - return (__m128) __builtin_ia32_gatherdiv4sf (src, - base, - (__v2di)index, - mask, - scale); -} - -extern __inline __m128 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mask_i64gather_ps (__m128 src, float const *base, __m128i index, - __m128 mask, const int scale) -{ - return (__m128) __builtin_ia32_gatherdiv4sf ((__v4sf)src, - base, - (__v2di)index, - (__v4sf)mask, - scale); -} - -extern __inline __m128 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_i64gather_ps (float const *base, __m256i index, const int scale) -{ - __v4sf src = _mm_setzero_ps (); - __v4sf mask = _mm_cmpeq_ps (src, src); - - return (__m128) __builtin_ia32_gatherdiv4sf256 (src, - base, - (__v4di)index, - mask, - scale); -} - -extern __inline __m128 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_mask_i64gather_ps (__m128 src, float const *base, - __m256i index, __m128 mask, const int scale) -{ - return (__m128) __builtin_ia32_gatherdiv4sf256 ((__v4sf)src, - base, - (__v4di)index, - (__v4sf)mask, - scale); -} - -extern __inline __m128i -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_i32gather_epi64 (long long int const *base, - __m128i 
index, const int scale) -{ - __v2di src = __extension__ (__v2di){ 0, 0 }; - __v2di mask = __extension__ (__v2di){ ~0, ~0 }; - - return (__m128i) __builtin_ia32_gathersiv2di (src, - base, - (__v4si)index, - mask, - scale); -} - -extern __inline __m128i -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mask_i32gather_epi64 (__m128i src, long long int const *base, - __m128i index, __m128i mask, const int scale) -{ - return (__m128i) __builtin_ia32_gathersiv2di ((__v2di)src, - base, - (__v4si)index, - (__v2di)mask, - scale); -} - -extern __inline __m256i -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_i32gather_epi64 (long long int const *base, - __m128i index, const int scale) -{ - __v4di src = __extension__ (__v4di){ 0, 0, 0, 0 }; - __v4di mask = __extension__ (__v4di){ ~0, ~0, ~0, ~0 }; - - return (__m256i) __builtin_ia32_gathersiv4di (src, - base, - (__v4si)index, - mask, - scale); -} - -extern __inline __m256i -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_mask_i32gather_epi64 (__m256i src, long long int const *base, - __m128i index, __m256i mask, const int scale) -{ - return (__m256i) __builtin_ia32_gathersiv4di ((__v4di)src, - base, - (__v4si)index, - (__v4di)mask, - scale); -} - -extern __inline __m128i -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_i64gather_epi64 (long long int const *base, - __m128i index, const int scale) -{ - __v2di src = __extension__ (__v2di){ 0, 0 }; - __v2di mask = __extension__ (__v2di){ ~0, ~0 }; - - return (__m128i) __builtin_ia32_gatherdiv2di (src, - base, - (__v2di)index, - mask, - scale); -} - -extern __inline __m128i -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mask_i64gather_epi64 (__m128i src, long long int const *base, __m128i index, - __m128i mask, const int scale) -{ - return (__m128i) __builtin_ia32_gatherdiv2di ((__v2di)src, - base, - (__v2di)index, - (__v2di)mask, - scale); -} - 
-extern __inline __m256i -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_i64gather_epi64 (long long int const *base, - __m256i index, const int scale) -{ - __v4di src = __extension__ (__v4di){ 0, 0, 0, 0 }; - __v4di mask = __extension__ (__v4di){ ~0, ~0, ~0, ~0 }; - - return (__m256i) __builtin_ia32_gatherdiv4di (src, - base, - (__v4di)index, - mask, - scale); -} - -extern __inline __m256i -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_mask_i64gather_epi64 (__m256i src, long long int const *base, - __m256i index, __m256i mask, const int scale) -{ - return (__m256i) __builtin_ia32_gatherdiv4di ((__v4di)src, - base, - (__v4di)index, - (__v4di)mask, - scale); -} - -extern __inline __m128i -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_i32gather_epi32 (int const *base, __m128i index, const int scale) -{ - __v4si src = __extension__ (__v4si){ 0, 0, 0, 0 }; - __v4si mask = __extension__ (__v4si){ ~0, ~0, ~0, ~0 }; - - return (__m128i) __builtin_ia32_gathersiv4si (src, - base, - (__v4si)index, - mask, - scale); -} - -extern __inline __m128i -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mask_i32gather_epi32 (__m128i src, int const *base, __m128i index, - __m128i mask, const int scale) -{ - return (__m128i) __builtin_ia32_gathersiv4si ((__v4si)src, - base, - (__v4si)index, - (__v4si)mask, - scale); -} - -extern __inline __m256i -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_i32gather_epi32 (int const *base, __m256i index, const int scale) -{ - __v8si src = __extension__ (__v8si){ 0, 0, 0, 0, 0, 0, 0, 0 }; - __v8si mask = __extension__ (__v8si){ ~0, ~0, ~0, ~0, ~0, ~0, ~0, ~0 }; - - return (__m256i) __builtin_ia32_gathersiv8si (src, - base, - (__v8si)index, - mask, - scale); -} - -extern __inline __m256i -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_mask_i32gather_epi32 (__m256i src, int const *base, - 
__m256i index, __m256i mask, const int scale) -{ - return (__m256i) __builtin_ia32_gathersiv8si ((__v8si)src, - base, - (__v8si)index, - (__v8si)mask, - scale); -} - -extern __inline __m128i -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_i64gather_epi32 (int const *base, __m128i index, const int scale) -{ - __v4si src = __extension__ (__v4si){ 0, 0, 0, 0 }; - __v4si mask = __extension__ (__v4si){ ~0, ~0, ~0, ~0 }; - - return (__m128i) __builtin_ia32_gatherdiv4si (src, - base, - (__v2di)index, - mask, - scale); -} - -extern __inline __m128i -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mask_i64gather_epi32 (__m128i src, int const *base, __m128i index, - __m128i mask, const int scale) -{ - return (__m128i) __builtin_ia32_gatherdiv4si ((__v4si)src, - base, - (__v2di)index, - (__v4si)mask, - scale); -} - -extern __inline __m128i -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_i64gather_epi32 (int const *base, __m256i index, const int scale) -{ - __v4si src = __extension__ (__v4si){ 0, 0, 0, 0 }; - __v4si mask = __extension__ (__v4si){ ~0, ~0, ~0, ~0 }; - - return (__m128i) __builtin_ia32_gatherdiv4si256 (src, - base, - (__v4di)index, - mask, - scale); -} - -extern __inline __m128i -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_mask_i64gather_epi32 (__m128i src, int const *base, - __m256i index, __m128i mask, const int scale) -{ - return (__m128i) __builtin_ia32_gatherdiv4si256 ((__v4si)src, - base, - (__v4di)index, - (__v4si)mask, - scale); -} -#else /* __OPTIMIZE__ */ -#define _mm_i32gather_pd(BASE, INDEX, SCALE) \ - (__m128d) __builtin_ia32_gathersiv2df ((__v2df) _mm_setzero_pd (), \ - (double const *)BASE, \ - (__v4si)(__m128i)INDEX, \ - (__v2df)_mm_set1_pd( \ - (double)(long long int) -1), \ - (int)SCALE) - -#define _mm_mask_i32gather_pd(SRC, BASE, INDEX, MASK, SCALE) \ - (__m128d) __builtin_ia32_gathersiv2df ((__v2df)(__m128d)SRC, \ - (double const 
*)BASE, \ - (__v4si)(__m128i)INDEX, \ - (__v2df)(__m128d)MASK, \ - (int)SCALE) - -#define _mm256_i32gather_pd(BASE, INDEX, SCALE) \ - (__m256d) __builtin_ia32_gathersiv4df ((__v4df) _mm256_setzero_pd (), \ - (double const *)BASE, \ - (__v4si)(__m128i)INDEX, \ - (__v4df)_mm256_set1_pd( \ - (double)(long long int) -1), \ - (int)SCALE) - -#define _mm256_mask_i32gather_pd(SRC, BASE, INDEX, MASK, SCALE) \ - (__m256d) __builtin_ia32_gathersiv4df ((__v4df)(__m256d)SRC, \ - (double const *)BASE, \ - (__v4si)(__m128i)INDEX, \ - (__v4df)(__m256d)MASK, \ - (int)SCALE) - -#define _mm_i64gather_pd(BASE, INDEX, SCALE) \ - (__m128d) __builtin_ia32_gatherdiv2df ((__v2df) _mm_setzero_pd (), \ - (double const *)BASE, \ - (__v2di)(__m128i)INDEX, \ - (__v2df)_mm_set1_pd( \ - (double)(long long int) -1), \ - (int)SCALE) - -#define _mm_mask_i64gather_pd(SRC, BASE, INDEX, MASK, SCALE) \ - (__m128d) __builtin_ia32_gatherdiv2df ((__v2df)(__m128d)SRC, \ - (double const *)BASE, \ - (__v2di)(__m128i)INDEX, \ - (__v2df)(__m128d)MASK, \ - (int)SCALE) - -#define _mm256_i64gather_pd(BASE, INDEX, SCALE) \ - (__m256d) __builtin_ia32_gatherdiv4df ((__v4df) _mm256_setzero_pd (), \ - (double const *)BASE, \ - (__v4di)(__m256i)INDEX, \ - (__v4df)_mm256_set1_pd( \ - (double)(long long int) -1), \ - (int)SCALE) - -#define _mm256_mask_i64gather_pd(SRC, BASE, INDEX, MASK, SCALE) \ - (__m256d) __builtin_ia32_gatherdiv4df ((__v4df)(__m256d)SRC, \ - (double const *)BASE, \ - (__v4di)(__m256i)INDEX, \ - (__v4df)(__m256d)MASK, \ - (int)SCALE) - -#define _mm_i32gather_ps(BASE, INDEX, SCALE) \ - (__m128) __builtin_ia32_gathersiv4sf ((__v4sf) _mm_setzero_ps (), \ - (float const *)BASE, \ - (__v4si)(__m128i)INDEX, \ - _mm_set1_ps ((float)(int) -1), \ - (int)SCALE) - -#define _mm_mask_i32gather_ps(SRC, BASE, INDEX, MASK, SCALE) \ - (__m128) __builtin_ia32_gathersiv4sf ((__v4sf)(__m128d)SRC, \ - (float const *)BASE, \ - (__v4si)(__m128i)INDEX, \ - (__v4sf)(__m128d)MASK, \ - (int)SCALE) - -#define 
_mm256_i32gather_ps(BASE, INDEX, SCALE) \ - (__m256) __builtin_ia32_gathersiv8sf ((__v8sf) _mm256_setzero_ps (), \ - (float const *)BASE, \ - (__v8si)(__m256i)INDEX, \ - (__v8sf)_mm256_set1_ps ( \ - (float)(int) -1), \ - (int)SCALE) - -#define _mm256_mask_i32gather_ps(SRC, BASE, INDEX, MASK, SCALE) \ - (__m256) __builtin_ia32_gathersiv8sf ((__v8sf)(__m256)SRC, \ - (float const *)BASE, \ - (__v8si)(__m256i)INDEX, \ - (__v8sf)(__m256d)MASK, \ - (int)SCALE) - -#define _mm_i64gather_ps(BASE, INDEX, SCALE) \ - (__m128) __builtin_ia32_gatherdiv4sf ((__v4sf) _mm_setzero_pd (), \ - (float const *)BASE, \ - (__v2di)(__m128i)INDEX, \ - (__v4sf)_mm_set1_ps ( \ - (float)(int) -1), \ - (int)SCALE) - -#define _mm_mask_i64gather_ps(SRC, BASE, INDEX, MASK, SCALE) \ - (__m128) __builtin_ia32_gatherdiv4sf ((__v4sf)(__m128)SRC, \ - (float const *)BASE, \ - (__v2di)(__m128i)INDEX, \ - (__v4sf)(__m128d)MASK, \ - (int)SCALE) - -#define _mm256_i64gather_ps(BASE, INDEX, SCALE) \ - (__m128) __builtin_ia32_gatherdiv4sf256 ((__v4sf) _mm_setzero_ps (), \ - (float const *)BASE, \ - (__v4di)(__m256i)INDEX, \ - (__v4sf)_mm_set1_ps( \ - (float)(int) -1), \ - (int)SCALE) - -#define _mm256_mask_i64gather_ps(SRC, BASE, INDEX, MASK, SCALE) \ - (__m128) __builtin_ia32_gatherdiv4sf256 ((__v4sf)(__m128)SRC, \ - (float const *)BASE, \ - (__v4di)(__m256i)INDEX, \ - (__v4sf)(__m128)MASK, \ - (int)SCALE) - -#define _mm_i32gather_epi64(BASE, INDEX, SCALE) \ - (__m128i) __builtin_ia32_gathersiv2di ((__v2di) _mm_setzero_si128 (), \ - (long long const *)BASE, \ - (__v4si)(__m128i)INDEX, \ - (__v2di)_mm_set1_epi64x (-1), \ - (int)SCALE) - -#define _mm_mask_i32gather_epi64(SRC, BASE, INDEX, MASK, SCALE) \ - (__m128i) __builtin_ia32_gathersiv2di ((__v2di)(__m128i)SRC, \ - (long long const *)BASE, \ - (__v4si)(__m128i)INDEX, \ - (__v2di)(__m128i)MASK, \ - (int)SCALE) - -#define _mm256_i32gather_epi64(BASE, INDEX, SCALE) \ - (__m256i) __builtin_ia32_gathersiv4di ((__v4di) _mm256_setzero_si256 (), \ - (long long 
const *)BASE, \ - (__v4si)(__m128i)INDEX, \ - (__v4di)_mm256_set1_epi64x (-1), \ - (int)SCALE) - -#define _mm256_mask_i32gather_epi64(SRC, BASE, INDEX, MASK, SCALE) \ - (__m256i) __builtin_ia32_gathersiv4di ((__v4di)(__m256i)SRC, \ - (long long const *)BASE, \ - (__v4si)(__m128i)INDEX, \ - (__v4di)(__m256i)MASK, \ - (int)SCALE) - -#define _mm_i64gather_epi64(BASE, INDEX, SCALE) \ - (__m128i) __builtin_ia32_gatherdiv2di ((__v2di) _mm_setzero_si128 (), \ - (long long const *)BASE, \ - (__v2di)(__m128i)INDEX, \ - (__v2di)_mm_set1_epi64x (-1), \ - (int)SCALE) - -#define _mm_mask_i64gather_epi64(SRC, BASE, INDEX, MASK, SCALE) \ - (__m128i) __builtin_ia32_gatherdiv2di ((__v2di)(__m128i)SRC, \ - (long long const *)BASE, \ - (__v2di)(__m128i)INDEX, \ - (__v2di)(__m128i)MASK, \ - (int)SCALE) - -#define _mm256_i64gather_epi64(BASE, INDEX, SCALE) \ - (__m256i) __builtin_ia32_gatherdiv4di ((__v4di) _mm256_setzero_si256 (), \ - (long long const *)BASE, \ - (__v4di)(__m256i)INDEX, \ - (__v4di)_mm256_set1_epi64x (-1), \ - (int)SCALE) - -#define _mm256_mask_i64gather_epi64(SRC, BASE, INDEX, MASK, SCALE) \ - (__m256i) __builtin_ia32_gatherdiv4di ((__v4di)(__m256i)SRC, \ - (long long const *)BASE, \ - (__v4di)(__m256i)INDEX, \ - (__v4di)(__m256i)MASK, \ - (int)SCALE) - -#define _mm_i32gather_epi32(BASE, INDEX, SCALE) \ - (__m128i) __builtin_ia32_gathersiv4si ((__v4si) _mm_setzero_si128 (), \ - (int const *)BASE, \ - (__v4si)(__m128i)INDEX, \ - (__v4si)_mm_set1_epi32 (-1), \ - (int)SCALE) - -#define _mm_mask_i32gather_epi32(SRC, BASE, INDEX, MASK, SCALE) \ - (__m128i) __builtin_ia32_gathersiv4si ((__v4si)(__m128i)SRC, \ - (int const *)BASE, \ - (__v4si)(__m128i)INDEX, \ - (__v4si)(__m128i)MASK, \ - (int)SCALE) - -#define _mm256_i32gather_epi32(BASE, INDEX, SCALE) \ - (__m256i) __builtin_ia32_gathersiv8si ((__v8si) _mm256_setzero_si256 (), \ - (int const *)BASE, \ - (__v8si)(__m256i)INDEX, \ - (__v8si)_mm256_set1_epi32 (-1), \ - (int)SCALE) - -#define _mm256_mask_i32gather_epi32(SRC, 
BASE, INDEX, MASK, SCALE) \ - (__m256i) __builtin_ia32_gathersiv8si ((__v8si)(__m256i)SRC, \ - (int const *)BASE, \ - (__v8si)(__m256i)INDEX, \ - (__v8si)(__m256i)MASK, \ - (int)SCALE) - -#define _mm_i64gather_epi32(BASE, INDEX, SCALE) \ - (__m128i) __builtin_ia32_gatherdiv4si ((__v4si) _mm_setzero_si128 (), \ - (int const *)BASE, \ - (__v2di)(__m128i)INDEX, \ - (__v4si)_mm_set1_epi32 (-1), \ - (int)SCALE) - -#define _mm_mask_i64gather_epi32(SRC, BASE, INDEX, MASK, SCALE) \ - (__m128i) __builtin_ia32_gatherdiv4si ((__v4si)(__m128i)SRC, \ - (int const *)BASE, \ - (__v2di)(__m128i)INDEX, \ - (__v4si)(__m128i)MASK, \ - (int)SCALE) - -#define _mm256_i64gather_epi32(BASE, INDEX, SCALE) \ - (__m128i) __builtin_ia32_gatherdiv4si256 ((__v4si) _mm_setzero_si128 (), \ - (int const *)BASE, \ - (__v4di)(__m256i)INDEX, \ - (__v4si)_mm_set1_epi32(-1), \ - (int)SCALE) - -#define _mm256_mask_i64gather_epi32(SRC, BASE, INDEX, MASK, SCALE) \ - (__m128i) __builtin_ia32_gatherdiv4si256 ((__v4si)(__m128i)SRC, \ - (int const *)BASE, \ - (__v4di)(__m256i)INDEX, \ - (__v4si)(__m128i)MASK, \ - (int)SCALE) -#endif /* __OPTIMIZE__ */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/avxintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/avxintrin.h deleted file mode 100644 index b75de45..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/avxintrin.h +++ /dev/null @@ -1,1426 +0,0 @@ -/* Copyright (C) 2008-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. 
- - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -/* Implemented from the specification included in the Intel C++ Compiler - User Guide and Reference, version 11.0. */ - -#ifndef _IMMINTRIN_H_INCLUDED -# error "Never use <avxintrin.h> directly; include <immintrin.h> instead." -#endif - -/* Internal data types for implementing the intrinsics. */ -typedef double __v4df __attribute__ ((__vector_size__ (32))); -typedef float __v8sf __attribute__ ((__vector_size__ (32))); -typedef long long __v4di __attribute__ ((__vector_size__ (32))); -typedef int __v8si __attribute__ ((__vector_size__ (32))); -typedef short __v16hi __attribute__ ((__vector_size__ (32))); -typedef char __v32qi __attribute__ ((__vector_size__ (32))); - -/* The Intel API is flexible enough that we must allow aliasing with other - vector types, and their scalar components. */ -typedef float __m256 __attribute__ ((__vector_size__ (32), - __may_alias__)); -typedef long long __m256i __attribute__ ((__vector_size__ (32), - __may_alias__)); -typedef double __m256d __attribute__ ((__vector_size__ (32), - __may_alias__)); - -/* Compare predicates for scalar and packed compare intrinsics. 
*/ - -/* Equal (ordered, non-signaling) */ -#define _CMP_EQ_OQ 0x00 -/* Less-than (ordered, signaling) */ -#define _CMP_LT_OS 0x01 -/* Less-than-or-equal (ordered, signaling) */ -#define _CMP_LE_OS 0x02 -/* Unordered (non-signaling) */ -#define _CMP_UNORD_Q 0x03 -/* Not-equal (unordered, non-signaling) */ -#define _CMP_NEQ_UQ 0x04 -/* Not-less-than (unordered, signaling) */ -#define _CMP_NLT_US 0x05 -/* Not-less-than-or-equal (unordered, signaling) */ -#define _CMP_NLE_US 0x06 -/* Ordered (nonsignaling) */ -#define _CMP_ORD_Q 0x07 -/* Equal (unordered, non-signaling) */ -#define _CMP_EQ_UQ 0x08 -/* Not-greater-than-or-equal (unordered, signaling) */ -#define _CMP_NGE_US 0x09 -/* Not-greater-than (unordered, signaling) */ -#define _CMP_NGT_US 0x0a -/* False (ordered, non-signaling) */ -#define _CMP_FALSE_OQ 0x0b -/* Not-equal (ordered, non-signaling) */ -#define _CMP_NEQ_OQ 0x0c -/* Greater-than-or-equal (ordered, signaling) */ -#define _CMP_GE_OS 0x0d -/* Greater-than (ordered, signaling) */ -#define _CMP_GT_OS 0x0e -/* True (unordered, non-signaling) */ -#define _CMP_TRUE_UQ 0x0f -/* Equal (ordered, signaling) */ -#define _CMP_EQ_OS 0x10 -/* Less-than (ordered, non-signaling) */ -#define _CMP_LT_OQ 0x11 -/* Less-than-or-equal (ordered, non-signaling) */ -#define _CMP_LE_OQ 0x12 -/* Unordered (signaling) */ -#define _CMP_UNORD_S 0x13 -/* Not-equal (unordered, signaling) */ -#define _CMP_NEQ_US 0x14 -/* Not-less-than (unordered, non-signaling) */ -#define _CMP_NLT_UQ 0x15 -/* Not-less-than-or-equal (unordered, non-signaling) */ -#define _CMP_NLE_UQ 0x16 -/* Ordered (signaling) */ -#define _CMP_ORD_S 0x17 -/* Equal (unordered, signaling) */ -#define _CMP_EQ_US 0x18 -/* Not-greater-than-or-equal (unordered, non-signaling) */ -#define _CMP_NGE_UQ 0x19 -/* Not-greater-than (unordered, non-signaling) */ -#define _CMP_NGT_UQ 0x1a -/* False (ordered, signaling) */ -#define _CMP_FALSE_OS 0x1b -/* Not-equal (ordered, signaling) */ -#define _CMP_NEQ_OS 0x1c -/* 
Greater-than-or-equal (ordered, non-signaling) */ -#define _CMP_GE_OQ 0x1d -/* Greater-than (ordered, non-signaling) */ -#define _CMP_GT_OQ 0x1e -/* True (unordered, signaling) */ -#define _CMP_TRUE_US 0x1f - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_add_pd (__m256d __A, __m256d __B) -{ - return (__m256d) __builtin_ia32_addpd256 ((__v4df)__A, (__v4df)__B); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_add_ps (__m256 __A, __m256 __B) -{ - return (__m256) __builtin_ia32_addps256 ((__v8sf)__A, (__v8sf)__B); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_addsub_pd (__m256d __A, __m256d __B) -{ - return (__m256d) __builtin_ia32_addsubpd256 ((__v4df)__A, (__v4df)__B); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_addsub_ps (__m256 __A, __m256 __B) -{ - return (__m256) __builtin_ia32_addsubps256 ((__v8sf)__A, (__v8sf)__B); -} - - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_and_pd (__m256d __A, __m256d __B) -{ - return (__m256d) __builtin_ia32_andpd256 ((__v4df)__A, (__v4df)__B); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_and_ps (__m256 __A, __m256 __B) -{ - return (__m256) __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_andnot_pd (__m256d __A, __m256d __B) -{ - return (__m256d) __builtin_ia32_andnpd256 ((__v4df)__A, (__v4df)__B); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_andnot_ps (__m256 __A, __m256 __B) -{ - return (__m256) __builtin_ia32_andnps256 ((__v8sf)__A, (__v8sf)__B); -} - -/* Double/single precision floating point blend instructions - select - data 
from 2 sources using constant/variable mask. */ - -#ifdef __OPTIMIZE__ -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_blend_pd (__m256d __X, __m256d __Y, const int __M) -{ - return (__m256d) __builtin_ia32_blendpd256 ((__v4df)__X, - (__v4df)__Y, - __M); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_blend_ps (__m256 __X, __m256 __Y, const int __M) -{ - return (__m256) __builtin_ia32_blendps256 ((__v8sf)__X, - (__v8sf)__Y, - __M); -} -#else -#define _mm256_blend_pd(X, Y, M) \ - ((__m256d) __builtin_ia32_blendpd256 ((__v4df)(__m256d)(X), \ - (__v4df)(__m256d)(Y), (int)(M))) - -#define _mm256_blend_ps(X, Y, M) \ - ((__m256) __builtin_ia32_blendps256 ((__v8sf)(__m256)(X), \ - (__v8sf)(__m256)(Y), (int)(M))) -#endif - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_blendv_pd (__m256d __X, __m256d __Y, __m256d __M) -{ - return (__m256d) __builtin_ia32_blendvpd256 ((__v4df)__X, - (__v4df)__Y, - (__v4df)__M); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_blendv_ps (__m256 __X, __m256 __Y, __m256 __M) -{ - return (__m256) __builtin_ia32_blendvps256 ((__v8sf)__X, - (__v8sf)__Y, - (__v8sf)__M); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_div_pd (__m256d __A, __m256d __B) -{ - return (__m256d) __builtin_ia32_divpd256 ((__v4df)__A, (__v4df)__B); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_div_ps (__m256 __A, __m256 __B) -{ - return (__m256) __builtin_ia32_divps256 ((__v8sf)__A, (__v8sf)__B); -} - -/* Dot product instructions with mask-defined summing and zeroing parts - of result. 
*/ - -#ifdef __OPTIMIZE__ -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_dp_ps (__m256 __X, __m256 __Y, const int __M) -{ - return (__m256) __builtin_ia32_dpps256 ((__v8sf)__X, - (__v8sf)__Y, - __M); -} -#else -#define _mm256_dp_ps(X, Y, M) \ - ((__m256) __builtin_ia32_dpps256 ((__v8sf)(__m256)(X), \ - (__v8sf)(__m256)(Y), (int)(M))) -#endif - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_hadd_pd (__m256d __X, __m256d __Y) -{ - return (__m256d) __builtin_ia32_haddpd256 ((__v4df)__X, (__v4df)__Y); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_hadd_ps (__m256 __X, __m256 __Y) -{ - return (__m256) __builtin_ia32_haddps256 ((__v8sf)__X, (__v8sf)__Y); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_hsub_pd (__m256d __X, __m256d __Y) -{ - return (__m256d) __builtin_ia32_hsubpd256 ((__v4df)__X, (__v4df)__Y); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_hsub_ps (__m256 __X, __m256 __Y) -{ - return (__m256) __builtin_ia32_hsubps256 ((__v8sf)__X, (__v8sf)__Y); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_max_pd (__m256d __A, __m256d __B) -{ - return (__m256d) __builtin_ia32_maxpd256 ((__v4df)__A, (__v4df)__B); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_max_ps (__m256 __A, __m256 __B) -{ - return (__m256) __builtin_ia32_maxps256 ((__v8sf)__A, (__v8sf)__B); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_min_pd (__m256d __A, __m256d __B) -{ - return (__m256d) __builtin_ia32_minpd256 ((__v4df)__A, (__v4df)__B); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_min_ps (__m256 
__A, __m256 __B) -{ - return (__m256) __builtin_ia32_minps256 ((__v8sf)__A, (__v8sf)__B); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_mul_pd (__m256d __A, __m256d __B) -{ - return (__m256d) __builtin_ia32_mulpd256 ((__v4df)__A, (__v4df)__B); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_mul_ps (__m256 __A, __m256 __B) -{ - return (__m256) __builtin_ia32_mulps256 ((__v8sf)__A, (__v8sf)__B); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_or_pd (__m256d __A, __m256d __B) -{ - return (__m256d) __builtin_ia32_orpd256 ((__v4df)__A, (__v4df)__B); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_or_ps (__m256 __A, __m256 __B) -{ - return (__m256) __builtin_ia32_orps256 ((__v8sf)__A, (__v8sf)__B); -} - -#ifdef __OPTIMIZE__ -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_shuffle_pd (__m256d __A, __m256d __B, const int __mask) -{ - return (__m256d) __builtin_ia32_shufpd256 ((__v4df)__A, (__v4df)__B, - __mask); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_shuffle_ps (__m256 __A, __m256 __B, const int __mask) -{ - return (__m256) __builtin_ia32_shufps256 ((__v8sf)__A, (__v8sf)__B, - __mask); -} -#else -#define _mm256_shuffle_pd(A, B, N) \ - ((__m256d)__builtin_ia32_shufpd256 ((__v4df)(__m256d)(A), \ - (__v4df)(__m256d)(B), (int)(N))) - -#define _mm256_shuffle_ps(A, B, N) \ - ((__m256) __builtin_ia32_shufps256 ((__v8sf)(__m256)(A), \ - (__v8sf)(__m256)(B), (int)(N))) -#endif - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_sub_pd (__m256d __A, __m256d __B) -{ - return (__m256d) __builtin_ia32_subpd256 ((__v4df)__A, (__v4df)__B); -} - -extern __inline __m256 __attribute__((__gnu_inline__, 
__always_inline__, __artificial__)) -_mm256_sub_ps (__m256 __A, __m256 __B) -{ - return (__m256) __builtin_ia32_subps256 ((__v8sf)__A, (__v8sf)__B); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_xor_pd (__m256d __A, __m256d __B) -{ - return (__m256d) __builtin_ia32_xorpd256 ((__v4df)__A, (__v4df)__B); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_xor_ps (__m256 __A, __m256 __B) -{ - return (__m256) __builtin_ia32_xorps256 ((__v8sf)__A, (__v8sf)__B); -} - -#ifdef __OPTIMIZE__ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmp_pd (__m128d __X, __m128d __Y, const int __P) -{ - return (__m128d) __builtin_ia32_cmppd ((__v2df)__X, (__v2df)__Y, __P); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmp_ps (__m128 __X, __m128 __Y, const int __P) -{ - return (__m128) __builtin_ia32_cmpps ((__v4sf)__X, (__v4sf)__Y, __P); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_cmp_pd (__m256d __X, __m256d __Y, const int __P) -{ - return (__m256d) __builtin_ia32_cmppd256 ((__v4df)__X, (__v4df)__Y, - __P); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_cmp_ps (__m256 __X, __m256 __Y, const int __P) -{ - return (__m256) __builtin_ia32_cmpps256 ((__v8sf)__X, (__v8sf)__Y, - __P); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmp_sd (__m128d __X, __m128d __Y, const int __P) -{ - return (__m128d) __builtin_ia32_cmpsd ((__v2df)__X, (__v2df)__Y, __P); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmp_ss (__m128 __X, __m128 __Y, const int __P) -{ - return (__m128) __builtin_ia32_cmpss ((__v4sf)__X, (__v4sf)__Y, __P); -} -#else -#define _mm_cmp_pd(X, Y, P) \ 
- ((__m128d) __builtin_ia32_cmppd ((__v2df)(__m128d)(X), \ - (__v2df)(__m128d)(Y), (int)(P))) - -#define _mm_cmp_ps(X, Y, P) \ - ((__m128) __builtin_ia32_cmpps ((__v4sf)(__m128)(X), \ - (__v4sf)(__m128)(Y), (int)(P))) - -#define _mm256_cmp_pd(X, Y, P) \ - ((__m256d) __builtin_ia32_cmppd256 ((__v4df)(__m256d)(X), \ - (__v4df)(__m256d)(Y), (int)(P))) - -#define _mm256_cmp_ps(X, Y, P) \ - ((__m256) __builtin_ia32_cmpps256 ((__v8sf)(__m256)(X), \ - (__v8sf)(__m256)(Y), (int)(P))) - -#define _mm_cmp_sd(X, Y, P) \ - ((__m128d) __builtin_ia32_cmpsd ((__v2df)(__m128d)(X), \ - (__v2df)(__m128d)(Y), (int)(P))) - -#define _mm_cmp_ss(X, Y, P) \ - ((__m128) __builtin_ia32_cmpss ((__v4sf)(__m128)(X), \ - (__v4sf)(__m128)(Y), (int)(P))) -#endif - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_cvtepi32_pd (__m128i __A) -{ - return (__m256d)__builtin_ia32_cvtdq2pd256 ((__v4si) __A); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_cvtepi32_ps (__m256i __A) -{ - return (__m256)__builtin_ia32_cvtdq2ps256 ((__v8si) __A); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_cvtpd_ps (__m256d __A) -{ - return (__m128)__builtin_ia32_cvtpd2ps256 ((__v4df) __A); -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_cvtps_epi32 (__m256 __A) -{ - return (__m256i)__builtin_ia32_cvtps2dq256 ((__v8sf) __A); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_cvtps_pd (__m128 __A) -{ - return (__m256d)__builtin_ia32_cvtps2pd256 ((__v4sf) __A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_cvttpd_epi32 (__m256d __A) -{ - return (__m128i)__builtin_ia32_cvttpd2dq256 ((__v4df) __A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) 
-_mm256_cvtpd_epi32 (__m256d __A) -{ - return (__m128i)__builtin_ia32_cvtpd2dq256 ((__v4df) __A); -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_cvttps_epi32 (__m256 __A) -{ - return (__m256i)__builtin_ia32_cvttps2dq256 ((__v8sf) __A); -} - -#ifdef __OPTIMIZE__ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_extractf128_pd (__m256d __X, const int __N) -{ - return (__m128d) __builtin_ia32_vextractf128_pd256 ((__v4df)__X, __N); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_extractf128_ps (__m256 __X, const int __N) -{ - return (__m128) __builtin_ia32_vextractf128_ps256 ((__v8sf)__X, __N); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_extractf128_si256 (__m256i __X, const int __N) -{ - return (__m128i) __builtin_ia32_vextractf128_si256 ((__v8si)__X, __N); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_extract_epi32 (__m256i __X, int const __N) -{ - __m128i __Y = _mm256_extractf128_si256 (__X, __N >> 2); - return _mm_extract_epi32 (__Y, __N % 4); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_extract_epi16 (__m256i __X, int const __N) -{ - __m128i __Y = _mm256_extractf128_si256 (__X, __N >> 3); - return _mm_extract_epi16 (__Y, __N % 8); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_extract_epi8 (__m256i __X, int const __N) -{ - __m128i __Y = _mm256_extractf128_si256 (__X, __N >> 4); - return _mm_extract_epi8 (__Y, __N % 16); -} - -#ifdef __x86_64__ -extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_extract_epi64 (__m256i __X, const int __N) -{ - __m128i __Y = _mm256_extractf128_si256 (__X, __N >> 1); - return _mm_extract_epi64 (__Y, __N 
% 2); -} -#endif -#else -#define _mm256_extractf128_pd(X, N) \ - ((__m128d) __builtin_ia32_vextractf128_pd256 ((__v4df)(__m256d)(X), \ - (int)(N))) - -#define _mm256_extractf128_ps(X, N) \ - ((__m128) __builtin_ia32_vextractf128_ps256 ((__v8sf)(__m256)(X), \ - (int)(N))) - -#define _mm256_extractf128_si256(X, N) \ - ((__m128i) __builtin_ia32_vextractf128_si256 ((__v8si)(__m256i)(X), \ - (int)(N))) - -#define _mm256_extract_epi32(X, N) \ - (__extension__ \ - ({ \ - __m128i __Y = _mm256_extractf128_si256 ((X), (N) >> 2); \ - _mm_extract_epi32 (__Y, (N) % 4); \ - })) - -#define _mm256_extract_epi16(X, N) \ - (__extension__ \ - ({ \ - __m128i __Y = _mm256_extractf128_si256 ((X), (N) >> 3); \ - _mm_extract_epi16 (__Y, (N) % 8); \ - })) - -#define _mm256_extract_epi8(X, N) \ - (__extension__ \ - ({ \ - __m128i __Y = _mm256_extractf128_si256 ((X), (N) >> 4); \ - _mm_extract_epi8 (__Y, (N) % 16); \ - })) - -#ifdef __x86_64__ -#define _mm256_extract_epi64(X, N) \ - (__extension__ \ - ({ \ - __m128i __Y = _mm256_extractf128_si256 ((X), (N) >> 1); \ - _mm_extract_epi64 (__Y, (N) % 2); \ - })) -#endif -#endif - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_zeroall (void) -{ - __builtin_ia32_vzeroall (); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_zeroupper (void) -{ - __builtin_ia32_vzeroupper (); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_permutevar_pd (__m128d __A, __m128i __C) -{ - return (__m128d) __builtin_ia32_vpermilvarpd ((__v2df)__A, - (__v2di)__C); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_permutevar_pd (__m256d __A, __m256i __C) -{ - return (__m256d) __builtin_ia32_vpermilvarpd256 ((__v4df)__A, - (__v4di)__C); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_permutevar_ps (__m128 __A, 
__m128i __C) -{ - return (__m128) __builtin_ia32_vpermilvarps ((__v4sf)__A, - (__v4si)__C); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_permutevar_ps (__m256 __A, __m256i __C) -{ - return (__m256) __builtin_ia32_vpermilvarps256 ((__v8sf)__A, - (__v8si)__C); -} - -#ifdef __OPTIMIZE__ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_permute_pd (__m128d __X, const int __C) -{ - return (__m128d) __builtin_ia32_vpermilpd ((__v2df)__X, __C); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_permute_pd (__m256d __X, const int __C) -{ - return (__m256d) __builtin_ia32_vpermilpd256 ((__v4df)__X, __C); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_permute_ps (__m128 __X, const int __C) -{ - return (__m128) __builtin_ia32_vpermilps ((__v4sf)__X, __C); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_permute_ps (__m256 __X, const int __C) -{ - return (__m256) __builtin_ia32_vpermilps256 ((__v8sf)__X, __C); -} -#else -#define _mm_permute_pd(X, C) \ - ((__m128d) __builtin_ia32_vpermilpd ((__v2df)(__m128d)(X), (int)(C))) - -#define _mm256_permute_pd(X, C) \ - ((__m256d) __builtin_ia32_vpermilpd256 ((__v4df)(__m256d)(X), (int)(C))) - -#define _mm_permute_ps(X, C) \ - ((__m128) __builtin_ia32_vpermilps ((__v4sf)(__m128)(X), (int)(C))) - -#define _mm256_permute_ps(X, C) \ - ((__m256) __builtin_ia32_vpermilps256 ((__v8sf)(__m256)(X), (int)(C))) -#endif - -#ifdef __OPTIMIZE__ -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_permute2f128_pd (__m256d __X, __m256d __Y, const int __C) -{ - return (__m256d) __builtin_ia32_vperm2f128_pd256 ((__v4df)__X, - (__v4df)__Y, - __C); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) 
-_mm256_permute2f128_ps (__m256 __X, __m256 __Y, const int __C) -{ - return (__m256) __builtin_ia32_vperm2f128_ps256 ((__v8sf)__X, - (__v8sf)__Y, - __C); -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_permute2f128_si256 (__m256i __X, __m256i __Y, const int __C) -{ - return (__m256i) __builtin_ia32_vperm2f128_si256 ((__v8si)__X, - (__v8si)__Y, - __C); -} -#else -#define _mm256_permute2f128_pd(X, Y, C) \ - ((__m256d) __builtin_ia32_vperm2f128_pd256 ((__v4df)(__m256d)(X), \ - (__v4df)(__m256d)(Y), \ - (int)(C))) - -#define _mm256_permute2f128_ps(X, Y, C) \ - ((__m256) __builtin_ia32_vperm2f128_ps256 ((__v8sf)(__m256)(X), \ - (__v8sf)(__m256)(Y), \ - (int)(C))) - -#define _mm256_permute2f128_si256(X, Y, C) \ - ((__m256i) __builtin_ia32_vperm2f128_si256 ((__v8si)(__m256i)(X), \ - (__v8si)(__m256i)(Y), \ - (int)(C))) -#endif - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_broadcast_ss (float const *__X) -{ - return (__m128) __builtin_ia32_vbroadcastss (__X); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_broadcast_sd (double const *__X) -{ - return (__m256d) __builtin_ia32_vbroadcastsd256 (__X); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_broadcast_ss (float const *__X) -{ - return (__m256) __builtin_ia32_vbroadcastss256 (__X); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_broadcast_pd (__m128d const *__X) -{ - return (__m256d) __builtin_ia32_vbroadcastf128_pd256 (__X); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_broadcast_ps (__m128 const *__X) -{ - return (__m256) __builtin_ia32_vbroadcastf128_ps256 (__X); -} - -#ifdef __OPTIMIZE__ -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) 
-_mm256_insertf128_pd (__m256d __X, __m128d __Y, const int __O) -{ - return (__m256d) __builtin_ia32_vinsertf128_pd256 ((__v4df)__X, - (__v2df)__Y, - __O); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_insertf128_ps (__m256 __X, __m128 __Y, const int __O) -{ - return (__m256) __builtin_ia32_vinsertf128_ps256 ((__v8sf)__X, - (__v4sf)__Y, - __O); -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_insertf128_si256 (__m256i __X, __m128i __Y, const int __O) -{ - return (__m256i) __builtin_ia32_vinsertf128_si256 ((__v8si)__X, - (__v4si)__Y, - __O); -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_insert_epi32 (__m256i __X, int __D, int const __N) -{ - __m128i __Y = _mm256_extractf128_si256 (__X, __N >> 2); - __Y = _mm_insert_epi32 (__Y, __D, __N % 4); - return _mm256_insertf128_si256 (__X, __Y, __N >> 2); -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_insert_epi16 (__m256i __X, int __D, int const __N) -{ - __m128i __Y = _mm256_extractf128_si256 (__X, __N >> 3); - __Y = _mm_insert_epi16 (__Y, __D, __N % 8); - return _mm256_insertf128_si256 (__X, __Y, __N >> 3); -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_insert_epi8 (__m256i __X, int __D, int const __N) -{ - __m128i __Y = _mm256_extractf128_si256 (__X, __N >> 4); - __Y = _mm_insert_epi8 (__Y, __D, __N % 16); - return _mm256_insertf128_si256 (__X, __Y, __N >> 4); -} - -#ifdef __x86_64__ -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_insert_epi64 (__m256i __X, long long __D, int const __N) -{ - __m128i __Y = _mm256_extractf128_si256 (__X, __N >> 1); - __Y = _mm_insert_epi64 (__Y, __D, __N % 2); - return _mm256_insertf128_si256 (__X, __Y, __N >> 1); -} -#endif -#else -#define 
_mm256_insertf128_pd(X, Y, O) \ - ((__m256d) __builtin_ia32_vinsertf128_pd256 ((__v4df)(__m256d)(X), \ - (__v2df)(__m128d)(Y), \ - (int)(O))) - -#define _mm256_insertf128_ps(X, Y, O) \ - ((__m256) __builtin_ia32_vinsertf128_ps256 ((__v8sf)(__m256)(X), \ - (__v4sf)(__m128)(Y), \ - (int)(O))) - -#define _mm256_insertf128_si256(X, Y, O) \ - ((__m256i) __builtin_ia32_vinsertf128_si256 ((__v8si)(__m256i)(X), \ - (__v4si)(__m128i)(Y), \ - (int)(O))) - -#define _mm256_insert_epi32(X, D, N) \ - (__extension__ \ - ({ \ - __m128i __Y = _mm256_extractf128_si256 ((X), (N) >> 2); \ - __Y = _mm_insert_epi32 (__Y, (D), (N) % 4); \ - _mm256_insertf128_si256 ((X), __Y, (N) >> 2); \ - })) - -#define _mm256_insert_epi16(X, D, N) \ - (__extension__ \ - ({ \ - __m128i __Y = _mm256_extractf128_si256 ((X), (N) >> 3); \ - __Y = _mm_insert_epi16 (__Y, (D), (N) % 8); \ - _mm256_insertf128_si256 ((X), __Y, (N) >> 3); \ - })) - -#define _mm256_insert_epi8(X, D, N) \ - (__extension__ \ - ({ \ - __m128i __Y = _mm256_extractf128_si256 ((X), (N) >> 4); \ - __Y = _mm_insert_epi8 (__Y, (D), (N) % 16); \ - _mm256_insertf128_si256 ((X), __Y, (N) >> 4); \ - })) - -#ifdef __x86_64__ -#define _mm256_insert_epi64(X, D, N) \ - (__extension__ \ - ({ \ - __m128i __Y = _mm256_extractf128_si256 ((X), (N) >> 1); \ - __Y = _mm_insert_epi64 (__Y, (D), (N) % 2); \ - _mm256_insertf128_si256 ((X), __Y, (N) >> 1); \ - })) -#endif -#endif - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_load_pd (double const *__P) -{ - return *(__m256d *)__P; -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_store_pd (double *__P, __m256d __A) -{ - *(__m256d *)__P = __A; -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_load_ps (float const *__P) -{ - return *(__m256 *)__P; -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) 
-_mm256_store_ps (float *__P, __m256 __A) -{ - *(__m256 *)__P = __A; -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_loadu_pd (double const *__P) -{ - return (__m256d) __builtin_ia32_loadupd256 (__P); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_storeu_pd (double *__P, __m256d __A) -{ - __builtin_ia32_storeupd256 (__P, (__v4df)__A); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_loadu_ps (float const *__P) -{ - return (__m256) __builtin_ia32_loadups256 (__P); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_storeu_ps (float *__P, __m256 __A) -{ - __builtin_ia32_storeups256 (__P, (__v8sf)__A); -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_load_si256 (__m256i const *__P) -{ - return *__P; -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_store_si256 (__m256i *__P, __m256i __A) -{ - *__P = __A; -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_loadu_si256 (__m256i const *__P) -{ - return (__m256i) __builtin_ia32_loaddqu256 ((char const *)__P); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_storeu_si256 (__m256i *__P, __m256i __A) -{ - __builtin_ia32_storedqu256 ((char *)__P, (__v32qi)__A); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maskload_pd (double const *__P, __m128i __M) -{ - return (__m128d) __builtin_ia32_maskloadpd ((const __v2df *)__P, - (__v2di)__M); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maskstore_pd (double *__P, __m128i __M, __m128d __A) -{ - __builtin_ia32_maskstorepd ((__v2df *)__P, (__v2di)__M, 
(__v2df)__A); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_maskload_pd (double const *__P, __m256i __M) -{ - return (__m256d) __builtin_ia32_maskloadpd256 ((const __v4df *)__P, - (__v4di)__M); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_maskstore_pd (double *__P, __m256i __M, __m256d __A) -{ - __builtin_ia32_maskstorepd256 ((__v4df *)__P, (__v4di)__M, (__v4df)__A); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maskload_ps (float const *__P, __m128i __M) -{ - return (__m128) __builtin_ia32_maskloadps ((const __v4sf *)__P, - (__v4si)__M); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maskstore_ps (float *__P, __m128i __M, __m128 __A) -{ - __builtin_ia32_maskstoreps ((__v4sf *)__P, (__v4si)__M, (__v4sf)__A); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_maskload_ps (float const *__P, __m256i __M) -{ - return (__m256) __builtin_ia32_maskloadps256 ((const __v8sf *)__P, - (__v8si)__M); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_maskstore_ps (float *__P, __m256i __M, __m256 __A) -{ - __builtin_ia32_maskstoreps256 ((__v8sf *)__P, (__v8si)__M, (__v8sf)__A); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_movehdup_ps (__m256 __X) -{ - return (__m256) __builtin_ia32_movshdup256 ((__v8sf)__X); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_moveldup_ps (__m256 __X) -{ - return (__m256) __builtin_ia32_movsldup256 ((__v8sf)__X); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_movedup_pd (__m256d __X) -{ - return (__m256d) __builtin_ia32_movddup256 ((__v4df)__X); -} - -extern 
__inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_lddqu_si256 (__m256i const *__P) -{ - return (__m256i) __builtin_ia32_lddqu256 ((char const *)__P); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_stream_si256 (__m256i *__A, __m256i __B) -{ - __builtin_ia32_movntdq256 ((__v4di *)__A, (__v4di)__B); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_stream_pd (double *__A, __m256d __B) -{ - __builtin_ia32_movntpd256 (__A, (__v4df)__B); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_stream_ps (float *__P, __m256 __A) -{ - __builtin_ia32_movntps256 (__P, (__v8sf)__A); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_rcp_ps (__m256 __A) -{ - return (__m256) __builtin_ia32_rcpps256 ((__v8sf)__A); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_rsqrt_ps (__m256 __A) -{ - return (__m256) __builtin_ia32_rsqrtps256 ((__v8sf)__A); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_sqrt_pd (__m256d __A) -{ - return (__m256d) __builtin_ia32_sqrtpd256 ((__v4df)__A); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_sqrt_ps (__m256 __A) -{ - return (__m256) __builtin_ia32_sqrtps256 ((__v8sf)__A); -} - -#ifdef __OPTIMIZE__ -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_round_pd (__m256d __V, const int __M) -{ - return (__m256d) __builtin_ia32_roundpd256 ((__v4df)__V, __M); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_round_ps (__m256 __V, const int __M) -{ - return (__m256) __builtin_ia32_roundps256 ((__v8sf)__V, __M); -} -#else -#define _mm256_round_pd(V, 
M) \ - ((__m256d) __builtin_ia32_roundpd256 ((__v4df)(__m256d)(V), (int)(M))) - -#define _mm256_round_ps(V, M) \ - ((__m256) __builtin_ia32_roundps256 ((__v8sf)(__m256)(V), (int)(M))) -#endif - -#define _mm256_ceil_pd(V) _mm256_round_pd ((V), _MM_FROUND_CEIL) -#define _mm256_floor_pd(V) _mm256_round_pd ((V), _MM_FROUND_FLOOR) -#define _mm256_ceil_ps(V) _mm256_round_ps ((V), _MM_FROUND_CEIL) -#define _mm256_floor_ps(V) _mm256_round_ps ((V), _MM_FROUND_FLOOR) - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_unpackhi_pd (__m256d __A, __m256d __B) -{ - return (__m256d) __builtin_ia32_unpckhpd256 ((__v4df)__A, (__v4df)__B); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_unpacklo_pd (__m256d __A, __m256d __B) -{ - return (__m256d) __builtin_ia32_unpcklpd256 ((__v4df)__A, (__v4df)__B); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_unpackhi_ps (__m256 __A, __m256 __B) -{ - return (__m256) __builtin_ia32_unpckhps256 ((__v8sf)__A, (__v8sf)__B); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_unpacklo_ps (__m256 __A, __m256 __B) -{ - return (__m256) __builtin_ia32_unpcklps256 ((__v8sf)__A, (__v8sf)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_testz_pd (__m128d __M, __m128d __V) -{ - return __builtin_ia32_vtestzpd ((__v2df)__M, (__v2df)__V); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_testc_pd (__m128d __M, __m128d __V) -{ - return __builtin_ia32_vtestcpd ((__v2df)__M, (__v2df)__V); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_testnzc_pd (__m128d __M, __m128d __V) -{ - return __builtin_ia32_vtestnzcpd ((__v2df)__M, (__v2df)__V); -} - -extern __inline int __attribute__((__gnu_inline__, 
__always_inline__, __artificial__)) -_mm_testz_ps (__m128 __M, __m128 __V) -{ - return __builtin_ia32_vtestzps ((__v4sf)__M, (__v4sf)__V); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_testc_ps (__m128 __M, __m128 __V) -{ - return __builtin_ia32_vtestcps ((__v4sf)__M, (__v4sf)__V); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_testnzc_ps (__m128 __M, __m128 __V) -{ - return __builtin_ia32_vtestnzcps ((__v4sf)__M, (__v4sf)__V); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_testz_pd (__m256d __M, __m256d __V) -{ - return __builtin_ia32_vtestzpd256 ((__v4df)__M, (__v4df)__V); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_testc_pd (__m256d __M, __m256d __V) -{ - return __builtin_ia32_vtestcpd256 ((__v4df)__M, (__v4df)__V); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_testnzc_pd (__m256d __M, __m256d __V) -{ - return __builtin_ia32_vtestnzcpd256 ((__v4df)__M, (__v4df)__V); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_testz_ps (__m256 __M, __m256 __V) -{ - return __builtin_ia32_vtestzps256 ((__v8sf)__M, (__v8sf)__V); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_testc_ps (__m256 __M, __m256 __V) -{ - return __builtin_ia32_vtestcps256 ((__v8sf)__M, (__v8sf)__V); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_testnzc_ps (__m256 __M, __m256 __V) -{ - return __builtin_ia32_vtestnzcps256 ((__v8sf)__M, (__v8sf)__V); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_testz_si256 (__m256i __M, __m256i __V) -{ - return __builtin_ia32_ptestz256 ((__v4di)__M, (__v4di)__V); -} - -extern __inline int 
__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_testc_si256 (__m256i __M, __m256i __V) -{ - return __builtin_ia32_ptestc256 ((__v4di)__M, (__v4di)__V); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_testnzc_si256 (__m256i __M, __m256i __V) -{ - return __builtin_ia32_ptestnzc256 ((__v4di)__M, (__v4di)__V); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_movemask_pd (__m256d __A) -{ - return __builtin_ia32_movmskpd256 ((__v4df)__A); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_movemask_ps (__m256 __A) -{ - return __builtin_ia32_movmskps256 ((__v8sf)__A); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_setzero_pd (void) -{ - return __extension__ (__m256d){ 0.0, 0.0, 0.0, 0.0 }; -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_setzero_ps (void) -{ - return __extension__ (__m256){ 0.0, 0.0, 0.0, 0.0, - 0.0, 0.0, 0.0, 0.0 }; -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_setzero_si256 (void) -{ - return __extension__ (__m256i)(__v4di){ 0, 0, 0, 0 }; -} - -/* Create the vector [A B C D]. */ -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_set_pd (double __A, double __B, double __C, double __D) -{ - return __extension__ (__m256d){ __D, __C, __B, __A }; -} - -/* Create the vector [A B C D E F G H]. */ -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_set_ps (float __A, float __B, float __C, float __D, - float __E, float __F, float __G, float __H) -{ - return __extension__ (__m256){ __H, __G, __F, __E, - __D, __C, __B, __A }; -} - -/* Create the vector [A B C D E F G H]. 
*/ -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_set_epi32 (int __A, int __B, int __C, int __D, - int __E, int __F, int __G, int __H) -{ - return __extension__ (__m256i)(__v8si){ __H, __G, __F, __E, - __D, __C, __B, __A }; -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_set_epi16 (short __q15, short __q14, short __q13, short __q12, - short __q11, short __q10, short __q09, short __q08, - short __q07, short __q06, short __q05, short __q04, - short __q03, short __q02, short __q01, short __q00) -{ - return __extension__ (__m256i)(__v16hi){ - __q00, __q01, __q02, __q03, __q04, __q05, __q06, __q07, - __q08, __q09, __q10, __q11, __q12, __q13, __q14, __q15 - }; -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_set_epi8 (char __q31, char __q30, char __q29, char __q28, - char __q27, char __q26, char __q25, char __q24, - char __q23, char __q22, char __q21, char __q20, - char __q19, char __q18, char __q17, char __q16, - char __q15, char __q14, char __q13, char __q12, - char __q11, char __q10, char __q09, char __q08, - char __q07, char __q06, char __q05, char __q04, - char __q03, char __q02, char __q01, char __q00) -{ - return __extension__ (__m256i)(__v32qi){ - __q00, __q01, __q02, __q03, __q04, __q05, __q06, __q07, - __q08, __q09, __q10, __q11, __q12, __q13, __q14, __q15, - __q16, __q17, __q18, __q19, __q20, __q21, __q22, __q23, - __q24, __q25, __q26, __q27, __q28, __q29, __q30, __q31 - }; -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_set_epi64x (long long __A, long long __B, long long __C, - long long __D) -{ - return __extension__ (__m256i)(__v4di){ __D, __C, __B, __A }; -} - -/* Create a vector with all elements equal to A. 
*/ -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_set1_pd (double __A) -{ - return __extension__ (__m256d){ __A, __A, __A, __A }; -} - -/* Create a vector with all elements equal to A. */ -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_set1_ps (float __A) -{ - return __extension__ (__m256){ __A, __A, __A, __A, - __A, __A, __A, __A }; -} - -/* Create a vector with all elements equal to A. */ -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_set1_epi32 (int __A) -{ - return __extension__ (__m256i)(__v8si){ __A, __A, __A, __A, - __A, __A, __A, __A }; -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_set1_epi16 (short __A) -{ - return _mm256_set_epi16 (__A, __A, __A, __A, __A, __A, __A, __A, - __A, __A, __A, __A, __A, __A, __A, __A); -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_set1_epi8 (char __A) -{ - return _mm256_set_epi8 (__A, __A, __A, __A, __A, __A, __A, __A, - __A, __A, __A, __A, __A, __A, __A, __A, - __A, __A, __A, __A, __A, __A, __A, __A, - __A, __A, __A, __A, __A, __A, __A, __A); -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_set1_epi64x (long long __A) -{ - return __extension__ (__m256i)(__v4di){ __A, __A, __A, __A }; -} - -/* Create vectors of elements in the reversed order from the - _mm256_set_XXX functions. 
*/ - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_setr_pd (double __A, double __B, double __C, double __D) -{ - return _mm256_set_pd (__D, __C, __B, __A); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_setr_ps (float __A, float __B, float __C, float __D, - float __E, float __F, float __G, float __H) -{ - return _mm256_set_ps (__H, __G, __F, __E, __D, __C, __B, __A); -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_setr_epi32 (int __A, int __B, int __C, int __D, - int __E, int __F, int __G, int __H) -{ - return _mm256_set_epi32 (__H, __G, __F, __E, __D, __C, __B, __A); -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_setr_epi16 (short __q15, short __q14, short __q13, short __q12, - short __q11, short __q10, short __q09, short __q08, - short __q07, short __q06, short __q05, short __q04, - short __q03, short __q02, short __q01, short __q00) -{ - return _mm256_set_epi16 (__q00, __q01, __q02, __q03, - __q04, __q05, __q06, __q07, - __q08, __q09, __q10, __q11, - __q12, __q13, __q14, __q15); -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_setr_epi8 (char __q31, char __q30, char __q29, char __q28, - char __q27, char __q26, char __q25, char __q24, - char __q23, char __q22, char __q21, char __q20, - char __q19, char __q18, char __q17, char __q16, - char __q15, char __q14, char __q13, char __q12, - char __q11, char __q10, char __q09, char __q08, - char __q07, char __q06, char __q05, char __q04, - char __q03, char __q02, char __q01, char __q00) -{ - return _mm256_set_epi8 (__q00, __q01, __q02, __q03, - __q04, __q05, __q06, __q07, - __q08, __q09, __q10, __q11, - __q12, __q13, __q14, __q15, - __q16, __q17, __q18, __q19, - __q20, __q21, __q22, __q23, - __q24, __q25, __q26, __q27, - __q28, __q29, __q30, 
__q31); -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_setr_epi64x (long long __A, long long __B, long long __C, - long long __D) -{ - return _mm256_set_epi64x (__D, __C, __B, __A); -} - -/* Casts between various SP, DP, INT vector types. Note that these do no - conversion of values, they just change the type. */ -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_castpd_ps (__m256d __A) -{ - return (__m256) __A; -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_castpd_si256 (__m256d __A) -{ - return (__m256i) __A; -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_castps_pd (__m256 __A) -{ - return (__m256d) __A; -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_castps_si256(__m256 __A) -{ - return (__m256i) __A; -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_castsi256_ps (__m256i __A) -{ - return (__m256) __A; -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_castsi256_pd (__m256i __A) -{ - return (__m256d) __A; -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_castpd256_pd128 (__m256d __A) -{ - return (__m128d) __builtin_ia32_pd_pd256 ((__v4df)__A); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_castps256_ps128 (__m256 __A) -{ - return (__m128) __builtin_ia32_ps_ps256 ((__v8sf)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_castsi256_si128 (__m256i __A) -{ - return (__m128i) __builtin_ia32_si_si256 ((__v8si)__A); -} - -/* When cast is done from a 128 to 256-bit type, the low 128 bits of - the 256-bit result contain 
source parameter value and the upper 128 - bits of the result are undefined. Those intrinsics shouldn't - generate any extra moves. */ - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_castpd128_pd256 (__m128d __A) -{ - return (__m256d) __builtin_ia32_pd256_pd ((__v2df)__A); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_castps128_ps256 (__m128 __A) -{ - return (__m256) __builtin_ia32_ps256_ps ((__v4sf)__A); -} - -extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_castsi128_si256 (__m128i __A) -{ - return (__m256i) __builtin_ia32_si256_si ((__v4si)__A); -} diff --git a/lib/gcc/x86_64-linux-android/4.8/include/bmi2intrin.h b/lib/gcc/x86_64-linux-android/4.8/include/bmi2intrin.h deleted file mode 100644 index 929ea20..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/bmi2intrin.h +++ /dev/null @@ -1,102 +0,0 @@ -/* Copyright (C) 2011-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. 
*/ - -#if !defined _X86INTRIN_H_INCLUDED && !defined _IMMINTRIN_H_INCLUDED -# error "Never use <bmi2intrin.h> directly; include <x86intrin.h> instead." -#endif - -#ifndef __BMI2__ -# error "BMI2 instruction set not enabled" -#endif /* __BMI2__ */ - -#ifndef _BMI2INTRIN_H_INCLUDED -#define _BMI2INTRIN_H_INCLUDED - -extern __inline unsigned int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_bzhi_u32 (unsigned int __X, unsigned int __Y) -{ - return __builtin_ia32_bzhi_si (__X, __Y); -} - -extern __inline unsigned int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_pdep_u32 (unsigned int __X, unsigned int __Y) -{ - return __builtin_ia32_pdep_si (__X, __Y); -} - -extern __inline unsigned int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_pext_u32 (unsigned int __X, unsigned int __Y) -{ - return __builtin_ia32_pext_si (__X, __Y); -} - -#ifdef __x86_64__ - -extern __inline unsigned long long -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_bzhi_u64 (unsigned long long __X, unsigned long long __Y) -{ - return __builtin_ia32_bzhi_di (__X, __Y); -} - -extern __inline unsigned long long -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_pdep_u64 (unsigned long long __X, unsigned long long __Y) -{ - return __builtin_ia32_pdep_di (__X, __Y); -} - -extern __inline unsigned long long -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_pext_u64 (unsigned long long __X, unsigned long long __Y) -{ - return __builtin_ia32_pext_di (__X, __Y); -} - -extern __inline unsigned long long -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mulx_u64 (unsigned long long __X, unsigned long long __Y, - unsigned long long *__P) -{ - unsigned __int128 __res = (unsigned __int128) __X * __Y; - *__P = (unsigned long long) (__res >> 64); - return (unsigned long long) __res; -} - -#else /* !__x86_64__ */ - -extern __inline unsigned int 
-__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mulx_u32 (unsigned int __X, unsigned int __Y, unsigned int *__P) -{ - unsigned long long __res = (unsigned long long) __X * __Y; - *__P = (unsigned int) (__res >> 32); - return (unsigned int) __res; -} - -#endif /* !__x86_64__ */ - -#endif /* _BMI2INTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/bmiintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/bmiintrin.h deleted file mode 100644 index fc7f2ec..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/bmiintrin.h +++ /dev/null @@ -1,177 +0,0 @@ -/* Copyright (C) 2010-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#if !defined _X86INTRIN_H_INCLUDED && !defined _IMMINTRIN_H_INCLUDED -# error "Never use <bmiintrin.h> directly; include <x86intrin.h> instead." 
-#endif - -#ifndef __BMI__ -# error "BMI instruction set not enabled" -#endif /* __BMI__ */ - -#ifndef _BMIINTRIN_H_INCLUDED -#define _BMIINTRIN_H_INCLUDED - -extern __inline unsigned short __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__tzcnt_u16 (unsigned short __X) -{ - return __builtin_ctzs (__X); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__andn_u32 (unsigned int __X, unsigned int __Y) -{ - return ~__X & __Y; -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__bextr_u32 (unsigned int __X, unsigned int __Y) -{ - return __builtin_ia32_bextr_u32 (__X, __Y); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_bextr_u32 (unsigned int __X, unsigned int __Y, unsigned __Z) -{ - return __builtin_ia32_bextr_u32 (__X, ((__Y & 0xff) | ((__Z & 0xff) << 8))); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blsi_u32 (unsigned int __X) -{ - return __X & -__X; -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_blsi_u32 (unsigned int __X) -{ - return __blsi_u32 (__X); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blsmsk_u32 (unsigned int __X) -{ - return __X ^ (__X - 1); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_blsmsk_u32 (unsigned int __X) -{ - return __blsmsk_u32 (__X); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blsr_u32 (unsigned int __X) -{ - return __X & (__X - 1); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_blsr_u32 (unsigned int __X) -{ - return __blsr_u32 (__X); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, 
__always_inline__, __artificial__)) -__tzcnt_u32 (unsigned int __X) -{ - return __builtin_ctz (__X); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_tzcnt_u32 (unsigned int __X) -{ - return __builtin_ctz (__X); -} - - -#ifdef __x86_64__ -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__andn_u64 (unsigned long long __X, unsigned long long __Y) -{ - return ~__X & __Y; -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__bextr_u64 (unsigned long long __X, unsigned long long __Y) -{ - return __builtin_ia32_bextr_u64 (__X, __Y); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_bextr_u64 (unsigned long long __X, unsigned int __Y, unsigned int __Z) -{ - return __builtin_ia32_bextr_u64 (__X, ((__Y & 0xff) | ((__Z & 0xff) << 8))); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blsi_u64 (unsigned long long __X) -{ - return __X & -__X; -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_blsi_u64 (unsigned long long __X) -{ - return __blsi_u64 (__X); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blsmsk_u64 (unsigned long long __X) -{ - return __X ^ (__X - 1); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_blsmsk_u64 (unsigned long long __X) -{ - return __blsmsk_u64 (__X); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blsr_u64 (unsigned long long __X) -{ - return __X & (__X - 1); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_blsr_u64 (unsigned long long __X) -{ - return 
__blsr_u64 (__X); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__tzcnt_u64 (unsigned long long __X) -{ - return __builtin_ctzll (__X); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_tzcnt_u64 (unsigned long long __X) -{ - return __builtin_ctzll (__X); -} - -#endif /* __x86_64__ */ - -#endif /* _BMIINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/bmmintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/bmmintrin.h deleted file mode 100644 index 9d68cec..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/bmmintrin.h +++ /dev/null @@ -1,29 +0,0 @@ -/* Copyright (C) 2007-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. 
*/ - -#ifndef _BMMINTRIN_H_INCLUDED -#define _BMMINTRIN_H_INCLUDED - -# error "SSE5 instruction set removed from compiler" - -#endif /* _BMMINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/cpuid.h b/lib/gcc/x86_64-linux-android/4.8/include/cpuid.h deleted file mode 100644 index c1e1eba..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/cpuid.h +++ /dev/null @@ -1,268 +0,0 @@ -/* - * Copyright (C) 2007-2013 Free Software Foundation, Inc. - * - * This file is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License as published by the - * Free Software Foundation; either version 3, or (at your option) any - * later version. - * - * This file is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * General Public License for more details. - * - * Under Section 7 of GPL version 3, you are granted additional - * permissions described in the GCC Runtime Library Exception, version - * 3.1, as published by the Free Software Foundation. - * - * You should have received a copy of the GNU General Public License and - * a copy of the GCC Runtime Library Exception along with this program; - * see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - * <http://www.gnu.org/licenses/>. 
- */ - -/* %ecx */ -#define bit_SSE3 (1 << 0) -#define bit_PCLMUL (1 << 1) -#define bit_LZCNT (1 << 5) -#define bit_SSSE3 (1 << 9) -#define bit_FMA (1 << 12) -#define bit_CMPXCHG16B (1 << 13) -#define bit_SSE4_1 (1 << 19) -#define bit_SSE4_2 (1 << 20) -#define bit_MOVBE (1 << 22) -#define bit_POPCNT (1 << 23) -#define bit_AES (1 << 25) -#define bit_XSAVE (1 << 26) -#define bit_OSXSAVE (1 << 27) -#define bit_AVX (1 << 28) -#define bit_F16C (1 << 29) -#define bit_RDRND (1 << 30) - -/* %edx */ -#define bit_CMPXCHG8B (1 << 8) -#define bit_CMOV (1 << 15) -#define bit_MMX (1 << 23) -#define bit_FXSAVE (1 << 24) -#define bit_SSE (1 << 25) -#define bit_SSE2 (1 << 26) - -/* Extended Features */ -/* %ecx */ -#define bit_LAHF_LM (1 << 0) -#define bit_ABM (1 << 5) -#define bit_SSE4a (1 << 6) -#define bit_PRFCHW (1 << 8) -#define bit_XOP (1 << 11) -#define bit_LWP (1 << 15) -#define bit_FMA4 (1 << 16) -#define bit_TBM (1 << 21) - -/* %edx */ -#define bit_MMXEXT (1 << 22) -#define bit_LM (1 << 29) -#define bit_3DNOWP (1 << 30) -#define bit_3DNOW (1 << 31) - -/* Extended Features (%eax == 7) */ -#define bit_FSGSBASE (1 << 0) -#define bit_BMI (1 << 3) -#define bit_HLE (1 << 4) -#define bit_AVX2 (1 << 5) -#define bit_BMI2 (1 << 8) -#define bit_RTM (1 << 11) -#define bit_RDSEED (1 << 18) -#define bit_ADX (1 << 19) - -/* Extended State Enumeration Sub-leaf (%eax == 13, %ecx == 1) */ -#define bit_XSAVEOPT (1 << 0) - -/* Signatures for different CPU implementations as returned in uses - of cpuid with level 0. 
*/ -#define signature_AMD_ebx 0x68747541 -#define signature_AMD_ecx 0x444d4163 -#define signature_AMD_edx 0x69746e65 - -#define signature_CENTAUR_ebx 0x746e6543 -#define signature_CENTAUR_ecx 0x736c7561 -#define signature_CENTAUR_edx 0x48727561 - -#define signature_CYRIX_ebx 0x69727943 -#define signature_CYRIX_ecx 0x64616574 -#define signature_CYRIX_edx 0x736e4978 - -#define signature_INTEL_ebx 0x756e6547 -#define signature_INTEL_ecx 0x6c65746e -#define signature_INTEL_edx 0x49656e69 - -#define signature_TM1_ebx 0x6e617254 -#define signature_TM1_ecx 0x55504361 -#define signature_TM1_edx 0x74656d73 - -#define signature_TM2_ebx 0x756e6547 -#define signature_TM2_ecx 0x3638784d -#define signature_TM2_edx 0x54656e69 - -#define signature_NSC_ebx 0x646f6547 -#define signature_NSC_ecx 0x43534e20 -#define signature_NSC_edx 0x79622065 - -#define signature_NEXGEN_ebx 0x4778654e -#define signature_NEXGEN_ecx 0x6e657669 -#define signature_NEXGEN_edx 0x72446e65 - -#define signature_RISE_ebx 0x65736952 -#define signature_RISE_ecx 0x65736952 -#define signature_RISE_edx 0x65736952 - -#define signature_SIS_ebx 0x20536953 -#define signature_SIS_ecx 0x20536953 -#define signature_SIS_edx 0x20536953 - -#define signature_UMC_ebx 0x20434d55 -#define signature_UMC_ecx 0x20434d55 -#define signature_UMC_edx 0x20434d55 - -#define signature_VIA_ebx 0x20414956 -#define signature_VIA_ecx 0x20414956 -#define signature_VIA_edx 0x20414956 - -#define signature_VORTEX_ebx 0x74726f56 -#define signature_VORTEX_ecx 0x436f5320 -#define signature_VORTEX_edx 0x36387865 - -#if defined(__i386__) && defined(__PIC__) -/* %ebx may be the PIC register. 
*/ -#if __GNUC__ >= 3 -#define __cpuid(level, a, b, c, d) \ - __asm__ ("xchg{l}\t{%%}ebx, %k1\n\t" \ - "cpuid\n\t" \ - "xchg{l}\t{%%}ebx, %k1\n\t" \ - : "=a" (a), "=&r" (b), "=c" (c), "=d" (d) \ - : "0" (level)) - -#define __cpuid_count(level, count, a, b, c, d) \ - __asm__ ("xchg{l}\t{%%}ebx, %k1\n\t" \ - "cpuid\n\t" \ - "xchg{l}\t{%%}ebx, %k1\n\t" \ - : "=a" (a), "=&r" (b), "=c" (c), "=d" (d) \ - : "0" (level), "2" (count)) -#else -/* Host GCCs older than 3.0 weren't supporting Intel asm syntax - nor alternatives in i386 code. */ -#define __cpuid(level, a, b, c, d) \ - __asm__ ("xchgl\t%%ebx, %k1\n\t" \ - "cpuid\n\t" \ - "xchgl\t%%ebx, %k1\n\t" \ - : "=a" (a), "=&r" (b), "=c" (c), "=d" (d) \ - : "0" (level)) - -#define __cpuid_count(level, count, a, b, c, d) \ - __asm__ ("xchgl\t%%ebx, %k1\n\t" \ - "cpuid\n\t" \ - "xchgl\t%%ebx, %k1\n\t" \ - : "=a" (a), "=&r" (b), "=c" (c), "=d" (d) \ - : "0" (level), "2" (count)) -#endif -#elif defined(__x86_64__) && (defined(__code_model_medium__) || defined(__code_model_large__)) && defined(__PIC__) -/* %rbx may be the PIC register. */ -#define __cpuid(level, a, b, c, d) \ - __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t" \ - "cpuid\n\t" \ - "xchg{q}\t{%%}rbx, %q1\n\t" \ - : "=a" (a), "=&r" (b), "=c" (c), "=d" (d) \ - : "0" (level)) - -#define __cpuid_count(level, count, a, b, c, d) \ - __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t" \ - "cpuid\n\t" \ - "xchg{q}\t{%%}rbx, %q1\n\t" \ - : "=a" (a), "=&r" (b), "=c" (c), "=d" (d) \ - : "0" (level), "2" (count)) -#else -#define __cpuid(level, a, b, c, d) \ - __asm__ ("cpuid\n\t" \ - : "=a" (a), "=b" (b), "=c" (c), "=d" (d) \ - : "0" (level)) - -#define __cpuid_count(level, count, a, b, c, d) \ - __asm__ ("cpuid\n\t" \ - : "=a" (a), "=b" (b), "=c" (c), "=d" (d) \ - : "0" (level), "2" (count)) -#endif - -/* Return highest supported input value for cpuid instruction. ext can - be either 0x0 or 0x80000000 to return highest supported value for - basic or extended cpuid information.
Function returns 0 if cpuid - is not supported or whatever cpuid returns in eax register. If sig - pointer is non-null, then first four bytes of the signature - (as found in ebx register) are returned in location pointed by sig. */ - -static __inline unsigned int -__get_cpuid_max (unsigned int __ext, unsigned int *__sig) -{ - unsigned int __eax, __ebx, __ecx, __edx; - -#ifndef __x86_64__ - /* See if we can use cpuid. On AMD64 we always can. */ -#if __GNUC__ >= 3 - __asm__ ("pushf{l|d}\n\t" - "pushf{l|d}\n\t" - "pop{l}\t%0\n\t" - "mov{l}\t{%0, %1|%1, %0}\n\t" - "xor{l}\t{%2, %0|%0, %2}\n\t" - "push{l}\t%0\n\t" - "popf{l|d}\n\t" - "pushf{l|d}\n\t" - "pop{l}\t%0\n\t" - "popf{l|d}\n\t" - : "=&r" (__eax), "=&r" (__ebx) - : "i" (0x00200000)); -#else -/* Host GCCs older than 3.0 weren't supporting Intel asm syntax - nor alternatives in i386 code. */ - __asm__ ("pushfl\n\t" - "pushfl\n\t" - "popl\t%0\n\t" - "movl\t%0, %1\n\t" - "xorl\t%2, %0\n\t" - "pushl\t%0\n\t" - "popfl\n\t" - "pushfl\n\t" - "popl\t%0\n\t" - "popfl\n\t" - : "=&r" (__eax), "=&r" (__ebx) - : "i" (0x00200000)); -#endif - - if (!((__eax ^ __ebx) & 0x00200000)) - return 0; -#endif - - /* Host supports cpuid. Return highest supported cpuid input value. */ - __cpuid (__ext, __eax, __ebx, __ecx, __edx); - - if (__sig) - *__sig = __ebx; - - return __eax; -} - -/* Return cpuid data for requested cpuid level, as found in returned - eax, ebx, ecx and edx registers. The function checks if cpuid is - supported and returns 1 for valid cpuid information or 0 for - unsupported cpuid level. All pointers are required to be non-null. 
*/ - -static __inline int -__get_cpuid (unsigned int __level, - unsigned int *__eax, unsigned int *__ebx, - unsigned int *__ecx, unsigned int *__edx) -{ - unsigned int __ext = __level & 0x80000000; - - if (__get_cpuid_max (__ext, 0) < __level) - return 0; - - __cpuid (__level, *__eax, *__ebx, *__ecx, *__edx); - return 1; -} diff --git a/lib/gcc/x86_64-linux-android/4.8/include/cross-stdarg.h b/lib/gcc/x86_64-linux-android/4.8/include/cross-stdarg.h deleted file mode 100644 index f934cf0..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/cross-stdarg.h +++ /dev/null @@ -1,72 +0,0 @@ -/* Copyright (C) 2002-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#ifndef __CROSS_STDARG_H_INCLUDED -#define __CROSS_STDARG_H_INCLUDED - -/* Make sure that for non x64 targets cross builtins are defined. */ -#ifndef __x86_64__ -/* Call abi ms_abi. */ -#define __builtin_ms_va_list __builtin_va_list -#define __builtin_ms_va_copy __builtin_va_copy -#define __builtin_ms_va_start __builtin_va_start -#define __builtin_ms_va_end __builtin_va_end - -/* Call abi sysv_abi. 
*/ -#define __builtin_sysv_va_list __builtin_va_list -#define __builtin_sysv_va_copy __builtin_va_copy -#define __builtin_sysv_va_start __builtin_va_start -#define __builtin_sysv_va_end __builtin_va_end -#endif - -#define __ms_va_copy(__d,__s) __builtin_ms_va_copy(__d,__s) -#define __ms_va_start(__v,__l) __builtin_ms_va_start(__v,__l) -#define __ms_va_arg(__v,__l) __builtin_va_arg(__v,__l) -#define __ms_va_end(__v) __builtin_ms_va_end(__v) - -#define __sysv_va_copy(__d,__s) __builtin_sysv_va_copy(__d,__s) -#define __sysv_va_start(__v,__l) __builtin_sysv_va_start(__v,__l) -#define __sysv_va_arg(__v,__l) __builtin_va_arg(__v,__l) -#define __sysv_va_end(__v) __builtin_sysv_va_end(__v) - -#ifndef __GNUC_SYSV_VA_LIST -#define __GNUC_SYSV_VA_LIST - typedef __builtin_sysv_va_list __gnuc_sysv_va_list; -#endif - -#ifndef _SYSV_VA_LIST_DEFINED -#define _SYSV_VA_LIST_DEFINED - typedef __gnuc_sysv_va_list sysv_va_list; -#endif - -#ifndef __GNUC_MS_VA_LIST -#define __GNUC_MS_VA_LIST - typedef __builtin_ms_va_list __gnuc_ms_va_list; -#endif - -#ifndef _MS_VA_LIST_DEFINED -#define _MS_VA_LIST_DEFINED - typedef __gnuc_ms_va_list ms_va_list; -#endif - -#endif /* __CROSS_STDARG_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/emmintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/emmintrin.h deleted file mode 100644 index cf404a1..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/emmintrin.h +++ /dev/null @@ -1,1520 +0,0 @@ -/* Copyright (C) 2003-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -/* Implemented from the specification included in the Intel C++ Compiler - User Guide and Reference, version 9.0. */ - -#ifndef _EMMINTRIN_H_INCLUDED -#define _EMMINTRIN_H_INCLUDED - -#ifndef __SSE2__ -# error "SSE2 instruction set not enabled" -#else - -/* We need definitions from the SSE header files*/ -#include <xmmintrin.h> - -/* SSE2 */ -typedef double __v2df __attribute__ ((__vector_size__ (16))); -typedef long long __v2di __attribute__ ((__vector_size__ (16))); -typedef int __v4si __attribute__ ((__vector_size__ (16))); -typedef short __v8hi __attribute__ ((__vector_size__ (16))); -typedef char __v16qi __attribute__ ((__vector_size__ (16))); - -/* The Intel API is flexible enough that we must allow aliasing with other - vector types, and their scalar components. */ -typedef long long __m128i __attribute__ ((__vector_size__ (16), __may_alias__)); -typedef double __m128d __attribute__ ((__vector_size__ (16), __may_alias__)); - -/* Create a selector for use with the SHUFPD instruction. */ -#define _MM_SHUFFLE2(fp1,fp0) \ - (((fp1) << 1) | (fp0)) - -/* Create a vector with element 0 as F and the rest zero. */ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set_sd (double __F) -{ - return __extension__ (__m128d){ __F, 0.0 }; -} - -/* Create a vector with both elements equal to F. 
*/ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set1_pd (double __F) -{ - return __extension__ (__m128d){ __F, __F }; -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set_pd1 (double __F) -{ - return _mm_set1_pd (__F); -} - -/* Create a vector with the lower value X and upper value W. */ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set_pd (double __W, double __X) -{ - return __extension__ (__m128d){ __X, __W }; -} - -/* Create a vector with the lower value W and upper value X. */ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_setr_pd (double __W, double __X) -{ - return __extension__ (__m128d){ __W, __X }; -} - -/* Create a vector of zeros. */ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_setzero_pd (void) -{ - return __extension__ (__m128d){ 0.0, 0.0 }; -} - -/* Sets the low DPFP value of A from the low value of B. */ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_move_sd (__m128d __A, __m128d __B) -{ - return (__m128d) __builtin_ia32_movsd ((__v2df)__A, (__v2df)__B); -} - -/* Load two DPFP values from P. The address must be 16-byte aligned. */ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_load_pd (double const *__P) -{ - return *(__m128d *)__P; -} - -/* Load two DPFP values from P. The address need not be 16-byte aligned. */ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_loadu_pd (double const *__P) -{ - return __builtin_ia32_loadupd (__P); -} - -/* Create a vector with all two elements equal to *P. 
*/ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_load1_pd (double const *__P) -{ - return _mm_set1_pd (*__P); -} - -/* Create a vector with element 0 as *P and the rest zero. */ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_load_sd (double const *__P) -{ - return _mm_set_sd (*__P); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_load_pd1 (double const *__P) -{ - return _mm_load1_pd (__P); -} - -/* Load two DPFP values in reverse order. The address must be aligned. */ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_loadr_pd (double const *__P) -{ - __m128d __tmp = _mm_load_pd (__P); - return __builtin_ia32_shufpd (__tmp, __tmp, _MM_SHUFFLE2 (0,1)); -} - -/* Store two DPFP values. The address must be 16-byte aligned. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_store_pd (double *__P, __m128d __A) -{ - *(__m128d *)__P = __A; -} - -/* Store two DPFP values. The address need not be 16-byte aligned. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_storeu_pd (double *__P, __m128d __A) -{ - __builtin_ia32_storeupd (__P, __A); -} - -/* Stores the lower DPFP value. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_store_sd (double *__P, __m128d __A) -{ - *__P = __builtin_ia32_vec_ext_v2df (__A, 0); -} - -extern __inline double __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsd_f64 (__m128d __A) -{ - return __builtin_ia32_vec_ext_v2df (__A, 0); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_storel_pd (double *__P, __m128d __A) -{ - _mm_store_sd (__P, __A); -} - -/* Stores the upper DPFP value. 
*/ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_storeh_pd (double *__P, __m128d __A) -{ - *__P = __builtin_ia32_vec_ext_v2df (__A, 1); -} - -/* Store the lower DPFP value across two words. - The address must be 16-byte aligned. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_store1_pd (double *__P, __m128d __A) -{ - _mm_store_pd (__P, __builtin_ia32_shufpd (__A, __A, _MM_SHUFFLE2 (0,0))); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_store_pd1 (double *__P, __m128d __A) -{ - _mm_store1_pd (__P, __A); -} - -/* Store two DPFP values in reverse order. The address must be aligned. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_storer_pd (double *__P, __m128d __A) -{ - _mm_store_pd (__P, __builtin_ia32_shufpd (__A, __A, _MM_SHUFFLE2 (0,1))); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi128_si32 (__m128i __A) -{ - return __builtin_ia32_vec_ext_v4si ((__v4si)__A, 0); -} - -#ifdef __x86_64__ -/* Intel intrinsic. */ -extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi128_si64 (__m128i __A) -{ - return __builtin_ia32_vec_ext_v2di ((__v2di)__A, 0); -} - -/* Microsoft intrinsic. 
*/ -extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi128_si64x (__m128i __A) -{ - return __builtin_ia32_vec_ext_v2di ((__v2di)__A, 0); -} -#endif - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_add_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_addpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_add_sd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_addsd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sub_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_subpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sub_sd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_subsd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mul_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_mulpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mul_sd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_mulsd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_div_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_divpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_div_sd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_divsd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sqrt_pd (__m128d __A) -{ - return 
(__m128d)__builtin_ia32_sqrtpd ((__v2df)__A); -} - -/* Return pair {sqrt (B[0]), A[1]}. */ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sqrt_sd (__m128d __A, __m128d __B) -{ - __v2df __tmp = __builtin_ia32_movsd ((__v2df)__A, (__v2df)__B); - return (__m128d)__builtin_ia32_sqrtsd ((__v2df)__tmp); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_min_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_minpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_min_sd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_minsd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_max_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_maxpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_max_sd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_maxsd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_and_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_andpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_andnot_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_andnpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_or_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_orpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_xor_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_xorpd ((__v2df)__A, (__v2df)__B); 
-} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpeq_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpeqpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmplt_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpltpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmple_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmplepd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpgt_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpgtpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpge_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpgepd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpneq_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpneqpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpnlt_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpnltpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpnle_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpnlepd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpngt_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpngtpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) 
-_mm_cmpnge_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpngepd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpord_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpordpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpunord_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpunordpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpeq_sd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpeqsd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmplt_sd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpltsd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmple_sd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmplesd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpgt_sd (__m128d __A, __m128d __B) -{ - return (__m128d) __builtin_ia32_movsd ((__v2df) __A, - (__v2df) - __builtin_ia32_cmpltsd ((__v2df) __B, - (__v2df) - __A)); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpge_sd (__m128d __A, __m128d __B) -{ - return (__m128d) __builtin_ia32_movsd ((__v2df) __A, - (__v2df) - __builtin_ia32_cmplesd ((__v2df) __B, - (__v2df) - __A)); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpneq_sd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpneqsd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, 
__artificial__)) -_mm_cmpnlt_sd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpnltsd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpnle_sd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpnlesd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpngt_sd (__m128d __A, __m128d __B) -{ - return (__m128d) __builtin_ia32_movsd ((__v2df) __A, - (__v2df) - __builtin_ia32_cmpnltsd ((__v2df) __B, - (__v2df) - __A)); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpnge_sd (__m128d __A, __m128d __B) -{ - return (__m128d) __builtin_ia32_movsd ((__v2df) __A, - (__v2df) - __builtin_ia32_cmpnlesd ((__v2df) __B, - (__v2df) - __A)); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpord_sd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpordsd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpunord_sd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_cmpunordsd ((__v2df)__A, (__v2df)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comieq_sd (__m128d __A, __m128d __B) -{ - return __builtin_ia32_comisdeq ((__v2df)__A, (__v2df)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comilt_sd (__m128d __A, __m128d __B) -{ - return __builtin_ia32_comisdlt ((__v2df)__A, (__v2df)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comile_sd (__m128d __A, __m128d __B) -{ - return __builtin_ia32_comisdle ((__v2df)__A, (__v2df)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) 
-_mm_comigt_sd (__m128d __A, __m128d __B) -{ - return __builtin_ia32_comisdgt ((__v2df)__A, (__v2df)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comige_sd (__m128d __A, __m128d __B) -{ - return __builtin_ia32_comisdge ((__v2df)__A, (__v2df)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comineq_sd (__m128d __A, __m128d __B) -{ - return __builtin_ia32_comisdneq ((__v2df)__A, (__v2df)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_ucomieq_sd (__m128d __A, __m128d __B) -{ - return __builtin_ia32_ucomisdeq ((__v2df)__A, (__v2df)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_ucomilt_sd (__m128d __A, __m128d __B) -{ - return __builtin_ia32_ucomisdlt ((__v2df)__A, (__v2df)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_ucomile_sd (__m128d __A, __m128d __B) -{ - return __builtin_ia32_ucomisdle ((__v2df)__A, (__v2df)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_ucomigt_sd (__m128d __A, __m128d __B) -{ - return __builtin_ia32_ucomisdgt ((__v2df)__A, (__v2df)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_ucomige_sd (__m128d __A, __m128d __B) -{ - return __builtin_ia32_ucomisdge ((__v2df)__A, (__v2df)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_ucomineq_sd (__m128d __A, __m128d __B) -{ - return __builtin_ia32_ucomisdneq ((__v2df)__A, (__v2df)__B); -} - -/* Create a vector of Qi, where i is the element number. 
*/ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set_epi64x (long long __q1, long long __q0) -{ - return __extension__ (__m128i)(__v2di){ __q0, __q1 }; -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set_epi64 (__m64 __q1, __m64 __q0) -{ - return _mm_set_epi64x ((long long)__q1, (long long)__q0); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set_epi32 (int __q3, int __q2, int __q1, int __q0) -{ - return __extension__ (__m128i)(__v4si){ __q0, __q1, __q2, __q3 }; -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set_epi16 (short __q7, short __q6, short __q5, short __q4, - short __q3, short __q2, short __q1, short __q0) -{ - return __extension__ (__m128i)(__v8hi){ - __q0, __q1, __q2, __q3, __q4, __q5, __q6, __q7 }; -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set_epi8 (char __q15, char __q14, char __q13, char __q12, - char __q11, char __q10, char __q09, char __q08, - char __q07, char __q06, char __q05, char __q04, - char __q03, char __q02, char __q01, char __q00) -{ - return __extension__ (__m128i)(__v16qi){ - __q00, __q01, __q02, __q03, __q04, __q05, __q06, __q07, - __q08, __q09, __q10, __q11, __q12, __q13, __q14, __q15 - }; -} - -/* Set all of the elements of the vector to A. 
*/ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set1_epi64x (long long __A) -{ - return _mm_set_epi64x (__A, __A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set1_epi64 (__m64 __A) -{ - return _mm_set_epi64 (__A, __A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set1_epi32 (int __A) -{ - return _mm_set_epi32 (__A, __A, __A, __A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set1_epi16 (short __A) -{ - return _mm_set_epi16 (__A, __A, __A, __A, __A, __A, __A, __A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set1_epi8 (char __A) -{ - return _mm_set_epi8 (__A, __A, __A, __A, __A, __A, __A, __A, - __A, __A, __A, __A, __A, __A, __A, __A); -} - -/* Create a vector of Qi, where i is the element number. - The parameter order is reversed from the _mm_set_epi* functions. 
*/ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_setr_epi64 (__m64 __q0, __m64 __q1) -{ - return _mm_set_epi64 (__q1, __q0); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_setr_epi32 (int __q0, int __q1, int __q2, int __q3) -{ - return _mm_set_epi32 (__q3, __q2, __q1, __q0); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_setr_epi16 (short __q0, short __q1, short __q2, short __q3, - short __q4, short __q5, short __q6, short __q7) -{ - return _mm_set_epi16 (__q7, __q6, __q5, __q4, __q3, __q2, __q1, __q0); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_setr_epi8 (char __q00, char __q01, char __q02, char __q03, - char __q04, char __q05, char __q06, char __q07, - char __q08, char __q09, char __q10, char __q11, - char __q12, char __q13, char __q14, char __q15) -{ - return _mm_set_epi8 (__q15, __q14, __q13, __q12, __q11, __q10, __q09, __q08, - __q07, __q06, __q05, __q04, __q03, __q02, __q01, __q00); -} - -/* Create a vector with element 0 as *P and the rest zero. 
*/ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_load_si128 (__m128i const *__P) -{ - return *__P; -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_loadu_si128 (__m128i const *__P) -{ - return (__m128i) __builtin_ia32_loaddqu ((char const *)__P); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_loadl_epi64 (__m128i const *__P) -{ - return _mm_set_epi64 ((__m64)0LL, *(__m64 *)__P); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_store_si128 (__m128i *__P, __m128i __B) -{ - *__P = __B; -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_storeu_si128 (__m128i *__P, __m128i __B) -{ - __builtin_ia32_storedqu ((char *)__P, (__v16qi)__B); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_storel_epi64 (__m128i *__P, __m128i __B) -{ - *(long long *)__P = __builtin_ia32_vec_ext_v2di ((__v2di)__B, 0); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_movepi64_pi64 (__m128i __B) -{ - return (__m64) __builtin_ia32_vec_ext_v2di ((__v2di)__B, 0); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_movpi64_epi64 (__m64 __A) -{ - return _mm_set_epi64 ((__m64)0LL, __A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_move_epi64 (__m128i __A) -{ - return (__m128i)__builtin_ia32_movq128 ((__v2di) __A); -} - -/* Create a vector of zeros. 
*/ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_setzero_si128 (void) -{ - return __extension__ (__m128i)(__v4si){ 0, 0, 0, 0 }; -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtepi32_pd (__m128i __A) -{ - return (__m128d)__builtin_ia32_cvtdq2pd ((__v4si) __A); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtepi32_ps (__m128i __A) -{ - return (__m128)__builtin_ia32_cvtdq2ps ((__v4si) __A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtpd_epi32 (__m128d __A) -{ - return (__m128i)__builtin_ia32_cvtpd2dq ((__v2df) __A); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtpd_pi32 (__m128d __A) -{ - return (__m64)__builtin_ia32_cvtpd2pi ((__v2df) __A); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtpd_ps (__m128d __A) -{ - return (__m128)__builtin_ia32_cvtpd2ps ((__v2df) __A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvttpd_epi32 (__m128d __A) -{ - return (__m128i)__builtin_ia32_cvttpd2dq ((__v2df) __A); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvttpd_pi32 (__m128d __A) -{ - return (__m64)__builtin_ia32_cvttpd2pi ((__v2df) __A); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtpi32_pd (__m64 __A) -{ - return (__m128d)__builtin_ia32_cvtpi2pd ((__v2si) __A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtps_epi32 (__m128 __A) -{ - return (__m128i)__builtin_ia32_cvtps2dq ((__v4sf) __A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvttps_epi32 (__m128 __A) -{ - 
return (__m128i)__builtin_ia32_cvttps2dq ((__v4sf) __A); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtps_pd (__m128 __A) -{ - return (__m128d)__builtin_ia32_cvtps2pd ((__v4sf) __A); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsd_si32 (__m128d __A) -{ - return __builtin_ia32_cvtsd2si ((__v2df) __A); -} - -#ifdef __x86_64__ -/* Intel intrinsic. */ -extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsd_si64 (__m128d __A) -{ - return __builtin_ia32_cvtsd2si64 ((__v2df) __A); -} - -/* Microsoft intrinsic. */ -extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsd_si64x (__m128d __A) -{ - return __builtin_ia32_cvtsd2si64 ((__v2df) __A); -} -#endif - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvttsd_si32 (__m128d __A) -{ - return __builtin_ia32_cvttsd2si ((__v2df) __A); -} - -#ifdef __x86_64__ -/* Intel intrinsic. */ -extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvttsd_si64 (__m128d __A) -{ - return __builtin_ia32_cvttsd2si64 ((__v2df) __A); -} - -/* Microsoft intrinsic. */ -extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvttsd_si64x (__m128d __A) -{ - return __builtin_ia32_cvttsd2si64 ((__v2df) __A); -} -#endif - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsd_ss (__m128 __A, __m128d __B) -{ - return (__m128)__builtin_ia32_cvtsd2ss ((__v4sf) __A, (__v2df) __B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi32_sd (__m128d __A, int __B) -{ - return (__m128d)__builtin_ia32_cvtsi2sd ((__v2df) __A, __B); -} - -#ifdef __x86_64__ -/* Intel intrinsic. 
*/ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi64_sd (__m128d __A, long long __B) -{ - return (__m128d)__builtin_ia32_cvtsi642sd ((__v2df) __A, __B); -} - -/* Microsoft intrinsic. */ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi64x_sd (__m128d __A, long long __B) -{ - return (__m128d)__builtin_ia32_cvtsi642sd ((__v2df) __A, __B); -} -#endif - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtss_sd (__m128d __A, __m128 __B) -{ - return (__m128d)__builtin_ia32_cvtss2sd ((__v2df) __A, (__v4sf)__B); -} - -#ifdef __OPTIMIZE__ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_shuffle_pd(__m128d __A, __m128d __B, const int __mask) -{ - return (__m128d)__builtin_ia32_shufpd ((__v2df)__A, (__v2df)__B, __mask); -} -#else -#define _mm_shuffle_pd(A, B, N) \ - ((__m128d)__builtin_ia32_shufpd ((__v2df)(__m128d)(A), \ - (__v2df)(__m128d)(B), (int)(N))) -#endif - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpackhi_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_unpckhpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpacklo_pd (__m128d __A, __m128d __B) -{ - return (__m128d)__builtin_ia32_unpcklpd ((__v2df)__A, (__v2df)__B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_loadh_pd (__m128d __A, double const *__B) -{ - return (__m128d)__builtin_ia32_loadhpd ((__v2df)__A, __B); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_loadl_pd (__m128d __A, double const *__B) -{ - return (__m128d)__builtin_ia32_loadlpd ((__v2df)__A, __B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__)) -_mm_movemask_pd (__m128d __A) -{ - return __builtin_ia32_movmskpd ((__v2df)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_packs_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_packsswb128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_packs_epi32 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_packssdw128 ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_packus_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_packuswb128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpackhi_epi8 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_punpckhbw128 ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpackhi_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_punpckhwd128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpackhi_epi32 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_punpckhdq128 ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpackhi_epi64 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_punpckhqdq128 ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpacklo_epi8 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_punpcklbw128 ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpacklo_epi16 (__m128i __A, __m128i 
__B) -{ - return (__m128i)__builtin_ia32_punpcklwd128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpacklo_epi32 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_punpckldq128 ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpacklo_epi64 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_punpcklqdq128 ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_add_epi8 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_paddb128 ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_add_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_paddw128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_add_epi32 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_paddd128 ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_add_epi64 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_paddq128 ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_adds_epi8 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_paddsb128 ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_adds_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_paddsw128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_adds_epu8 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_paddusb128 
((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_adds_epu16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_paddusw128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sub_epi8 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_psubb128 ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sub_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_psubw128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sub_epi32 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_psubd128 ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sub_epi64 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_psubq128 ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_subs_epi8 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_psubsb128 ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_subs_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_psubsw128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_subs_epu8 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_psubusb128 ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_subs_epu16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_psubusw128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i 
__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_madd_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pmaddwd128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mulhi_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pmulhw128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mullo_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pmullw128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mul_su32 (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pmuludq ((__v2si)__A, (__v2si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mul_epu32 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pmuludq128 ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_slli_epi16 (__m128i __A, int __B) -{ - return (__m128i)__builtin_ia32_psllwi128 ((__v8hi)__A, __B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_slli_epi32 (__m128i __A, int __B) -{ - return (__m128i)__builtin_ia32_pslldi128 ((__v4si)__A, __B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_slli_epi64 (__m128i __A, int __B) -{ - return (__m128i)__builtin_ia32_psllqi128 ((__v2di)__A, __B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srai_epi16 (__m128i __A, int __B) -{ - return (__m128i)__builtin_ia32_psrawi128 ((__v8hi)__A, __B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srai_epi32 (__m128i __A, int __B) -{ - return 
(__m128i)__builtin_ia32_psradi128 ((__v4si)__A, __B); -} - -#ifdef __OPTIMIZE__ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srli_si128 (__m128i __A, const int __N) -{ - return (__m128i)__builtin_ia32_psrldqi128 (__A, __N * 8); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_slli_si128 (__m128i __A, const int __N) -{ - return (__m128i)__builtin_ia32_pslldqi128 (__A, __N * 8); -} -#else -#define _mm_srli_si128(A, N) \ - ((__m128i)__builtin_ia32_psrldqi128 ((__m128i)(A), (int)(N) * 8)) -#define _mm_slli_si128(A, N) \ - ((__m128i)__builtin_ia32_pslldqi128 ((__m128i)(A), (int)(N) * 8)) -#endif - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srli_epi16 (__m128i __A, int __B) -{ - return (__m128i)__builtin_ia32_psrlwi128 ((__v8hi)__A, __B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srli_epi32 (__m128i __A, int __B) -{ - return (__m128i)__builtin_ia32_psrldi128 ((__v4si)__A, __B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srli_epi64 (__m128i __A, int __B) -{ - return (__m128i)__builtin_ia32_psrlqi128 ((__v2di)__A, __B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sll_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_psllw128((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sll_epi32 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pslld128((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sll_epi64 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_psllq128((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, 
__always_inline__, __artificial__)) -_mm_sra_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_psraw128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sra_epi32 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_psrad128 ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srl_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_psrlw128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srl_epi32 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_psrld128 ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srl_epi64 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_psrlq128 ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_and_si128 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pand128 ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_andnot_si128 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pandn128 ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_or_si128 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_por128 ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_xor_si128 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pxor128 ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpeq_epi8 (__m128i __A, __m128i __B) -{ - return 
(__m128i)__builtin_ia32_pcmpeqb128 ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpeq_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pcmpeqw128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpeq_epi32 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pcmpeqd128 ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmplt_epi8 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pcmpgtb128 ((__v16qi)__B, (__v16qi)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmplt_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pcmpgtw128 ((__v8hi)__B, (__v8hi)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmplt_epi32 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pcmpgtd128 ((__v4si)__B, (__v4si)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpgt_epi8 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pcmpgtb128 ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpgt_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pcmpgtw128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpgt_epi32 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pcmpgtd128 ((__v4si)__A, (__v4si)__B); -} - -#ifdef __OPTIMIZE__ -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_extract_epi16 (__m128i const __A, int const __N) -{ - return (unsigned short) 
__builtin_ia32_vec_ext_v8hi ((__v8hi)__A, __N); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_insert_epi16 (__m128i const __A, int const __D, int const __N) -{ - return (__m128i) __builtin_ia32_vec_set_v8hi ((__v8hi)__A, __D, __N); -} -#else -#define _mm_extract_epi16(A, N) \ - ((int) (unsigned short) __builtin_ia32_vec_ext_v8hi ((__v8hi)(__m128i)(A), (int)(N))) -#define _mm_insert_epi16(A, D, N) \ - ((__m128i) __builtin_ia32_vec_set_v8hi ((__v8hi)(__m128i)(A), \ - (int)(D), (int)(N))) -#endif - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_max_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pmaxsw128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_max_epu8 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pmaxub128 ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_min_epi16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pminsw128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_min_epu8 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pminub128 ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_movemask_epi8 (__m128i __A) -{ - return __builtin_ia32_pmovmskb128 ((__v16qi)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mulhi_epu16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pmulhuw128 ((__v8hi)__A, (__v8hi)__B); -} - -#ifdef __OPTIMIZE__ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_shufflehi_epi16 (__m128i __A, const int __mask) -{ - return 
(__m128i)__builtin_ia32_pshufhw ((__v8hi)__A, __mask); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_shufflelo_epi16 (__m128i __A, const int __mask) -{ - return (__m128i)__builtin_ia32_pshuflw ((__v8hi)__A, __mask); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_shuffle_epi32 (__m128i __A, const int __mask) -{ - return (__m128i)__builtin_ia32_pshufd ((__v4si)__A, __mask); -} -#else -#define _mm_shufflehi_epi16(A, N) \ - ((__m128i)__builtin_ia32_pshufhw ((__v8hi)(__m128i)(A), (int)(N))) -#define _mm_shufflelo_epi16(A, N) \ - ((__m128i)__builtin_ia32_pshuflw ((__v8hi)(__m128i)(A), (int)(N))) -#define _mm_shuffle_epi32(A, N) \ - ((__m128i)__builtin_ia32_pshufd ((__v4si)(__m128i)(A), (int)(N))) -#endif - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maskmoveu_si128 (__m128i __A, __m128i __B, char *__C) -{ - __builtin_ia32_maskmovdqu ((__v16qi)__A, (__v16qi)__B, __C); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_avg_epu8 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pavgb128 ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_avg_epu16 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_pavgw128 ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sad_epu8 (__m128i __A, __m128i __B) -{ - return (__m128i)__builtin_ia32_psadbw128 ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_stream_si32 (int *__A, int __B) -{ - __builtin_ia32_movnti (__A, __B); -} - -#ifdef __x86_64__ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_stream_si64 (long long int *__A, long 
long int __B) -{ - __builtin_ia32_movnti64 (__A, __B); -} -#endif - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_stream_si128 (__m128i *__A, __m128i __B) -{ - __builtin_ia32_movntdq ((__v2di *)__A, (__v2di)__B); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_stream_pd (double *__A, __m128d __B) -{ - __builtin_ia32_movntpd (__A, (__v2df)__B); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_clflush (void const *__A) -{ - __builtin_ia32_clflush (__A); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_lfence (void) -{ - __builtin_ia32_lfence (); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mfence (void) -{ - __builtin_ia32_mfence (); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi32_si128 (int __A) -{ - return _mm_set_epi32 (0, 0, 0, __A); -} - -#ifdef __x86_64__ -/* Intel intrinsic. */ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi64_si128 (long long __A) -{ - return _mm_set_epi64x (0, __A); -} - -/* Microsoft intrinsic. */ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi64x_si128 (long long __A) -{ - return _mm_set_epi64x (0, __A); -} -#endif - -/* Casts between various SP, DP, INT vector types. Note that these do no - conversion of values, they just change the type. 
*/ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_castpd_ps(__m128d __A) -{ - return (__m128) __A; -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_castpd_si128(__m128d __A) -{ - return (__m128i) __A; -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_castps_pd(__m128 __A) -{ - return (__m128d) __A; -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_castps_si128(__m128 __A) -{ - return (__m128i) __A; -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_castsi128_ps(__m128i __A) -{ - return (__m128) __A; -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_castsi128_pd(__m128i __A) -{ - return (__m128d) __A; -} - -#endif /* __SSE2__ */ - -#endif /* _EMMINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/f16cintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/f16cintrin.h deleted file mode 100644 index 4a29fcc..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/f16cintrin.h +++ /dev/null @@ -1,92 +0,0 @@ -/* Copyright (C) 2011-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. 
- - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#if !defined _X86INTRIN_H_INCLUDED && !defined _IMMINTRIN_H_INCLUDED -# error "Never use <f16intrin.h> directly; include <x86intrin.h> or <immintrin.h> instead." -#endif - -#ifndef __F16C__ -# error "F16C instruction set not enabled" -#else - -#ifndef _F16CINTRIN_H_INCLUDED -#define _F16CINTRIN_H_INCLUDED - -extern __inline float __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_cvtsh_ss (unsigned short __S) -{ - __v8hi __H = __extension__ (__v8hi){ (short) __S, 0, 0, 0, 0, 0, 0, 0 }; - __v4sf __A = __builtin_ia32_vcvtph2ps (__H); - return __builtin_ia32_vec_ext_v4sf (__A, 0); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtph_ps (__m128i __A) -{ - return (__m128) __builtin_ia32_vcvtph2ps ((__v8hi) __A); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_cvtph_ps (__m128i __A) -{ - return (__m256) __builtin_ia32_vcvtph2ps256 ((__v8hi) __A); -} - -#ifdef __OPTIMIZE__ -extern __inline unsigned short __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_cvtss_sh (float __F, const int __I) -{ - __v4sf __A = __extension__ (__v4sf){ __F, 0, 0, 0 }; - __v8hi __H = __builtin_ia32_vcvtps2ph (__A, __I); - return (unsigned short) __builtin_ia32_vec_ext_v8hi (__H, 0); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtps_ph (__m128 __A, const int __I) -{ - return (__m128i) __builtin_ia32_vcvtps2ph ((__v4sf) __A, __I); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_cvtps_ph (__m256 __A, const int __I) -{ - return (__m128i) __builtin_ia32_vcvtps2ph256 ((__v8sf) __A, __I); -} 
-#else -#define _cvtss_sh(__F, __I) \ - (__extension__ \ - ({ \ - __v4sf __A = __extension__ (__v4sf){ __F, 0, 0, 0 }; \ - __v8hi __H = __builtin_ia32_vcvtps2ph (__A, __I); \ - (unsigned short) __builtin_ia32_vec_ext_v8hi (__H, 0); \ - })) - -#define _mm_cvtps_ph(A, I) \ - ((__m128i) __builtin_ia32_vcvtps2ph ((__v4sf)(__m128) A, (int) (I))) - -#define _mm256_cvtps_ph(A, I) \ - ((__m128i) __builtin_ia32_vcvtps2ph256 ((__v8sf)(__m256) A, (int) (I))) -#endif /* __OPTIMIZE */ - -#endif /* _F16CINTRIN_H_INCLUDED */ -#endif /* __F16C__ */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/float.h b/lib/gcc/x86_64-linux-android/4.8/include/float.h deleted file mode 100644 index dd461d7..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/float.h +++ /dev/null @@ -1,277 +0,0 @@ -/* Copyright (C) 2002-2013 Free Software Foundation, Inc. - -This file is part of GCC. - -GCC is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; either version 3, or (at your option) -any later version. - -GCC is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. - -Under Section 7 of GPL version 3, you are granted additional -permissions described in the GCC Runtime Library Exception, version -3.1, as published by the Free Software Foundation. - -You should have received a copy of the GNU General Public License and -a copy of the GCC Runtime Library Exception along with this program; -see the files COPYING3 and COPYING.RUNTIME respectively. If not, see -<http://www.gnu.org/licenses/>. */ - -/* - * ISO C Standard: 5.2.4.2.2 Characteristics of floating types <float.h> - */ - -#ifndef _FLOAT_H___ -#define _FLOAT_H___ - -/* Radix of exponent representation, b. 
*/ -#undef FLT_RADIX -#define FLT_RADIX __FLT_RADIX__ - -/* Number of base-FLT_RADIX digits in the significand, p. */ -#undef FLT_MANT_DIG -#undef DBL_MANT_DIG -#undef LDBL_MANT_DIG -#define FLT_MANT_DIG __FLT_MANT_DIG__ -#define DBL_MANT_DIG __DBL_MANT_DIG__ -#define LDBL_MANT_DIG __LDBL_MANT_DIG__ - -/* Number of decimal digits, q, such that any floating-point number with q - decimal digits can be rounded into a floating-point number with p radix b - digits and back again without change to the q decimal digits, - - p * log10(b) if b is a power of 10 - floor((p - 1) * log10(b)) otherwise -*/ -#undef FLT_DIG -#undef DBL_DIG -#undef LDBL_DIG -#define FLT_DIG __FLT_DIG__ -#define DBL_DIG __DBL_DIG__ -#define LDBL_DIG __LDBL_DIG__ - -/* Minimum int x such that FLT_RADIX**(x-1) is a normalized float, emin */ -#undef FLT_MIN_EXP -#undef DBL_MIN_EXP -#undef LDBL_MIN_EXP -#define FLT_MIN_EXP __FLT_MIN_EXP__ -#define DBL_MIN_EXP __DBL_MIN_EXP__ -#define LDBL_MIN_EXP __LDBL_MIN_EXP__ - -/* Minimum negative integer such that 10 raised to that power is in the - range of normalized floating-point numbers, - - ceil(log10(b) * (emin - 1)) -*/ -#undef FLT_MIN_10_EXP -#undef DBL_MIN_10_EXP -#undef LDBL_MIN_10_EXP -#define FLT_MIN_10_EXP __FLT_MIN_10_EXP__ -#define DBL_MIN_10_EXP __DBL_MIN_10_EXP__ -#define LDBL_MIN_10_EXP __LDBL_MIN_10_EXP__ - -/* Maximum int x such that FLT_RADIX**(x-1) is a representable float, emax. 
*/ -#undef FLT_MAX_EXP -#undef DBL_MAX_EXP -#undef LDBL_MAX_EXP -#define FLT_MAX_EXP __FLT_MAX_EXP__ -#define DBL_MAX_EXP __DBL_MAX_EXP__ -#define LDBL_MAX_EXP __LDBL_MAX_EXP__ - -/* Maximum integer such that 10 raised to that power is in the range of - representable finite floating-point numbers, - - floor(log10((1 - b**-p) * b**emax)) -*/ -#undef FLT_MAX_10_EXP -#undef DBL_MAX_10_EXP -#undef LDBL_MAX_10_EXP -#define FLT_MAX_10_EXP __FLT_MAX_10_EXP__ -#define DBL_MAX_10_EXP __DBL_MAX_10_EXP__ -#define LDBL_MAX_10_EXP __LDBL_MAX_10_EXP__ - -/* Maximum representable finite floating-point number, - - (1 - b**-p) * b**emax -*/ -#undef FLT_MAX -#undef DBL_MAX -#undef LDBL_MAX -#define FLT_MAX __FLT_MAX__ -#define DBL_MAX __DBL_MAX__ -#define LDBL_MAX __LDBL_MAX__ - -/* The difference between 1 and the least value greater than 1 that is - representable in the given floating point type, b**1-p. */ -#undef FLT_EPSILON -#undef DBL_EPSILON -#undef LDBL_EPSILON -#define FLT_EPSILON __FLT_EPSILON__ -#define DBL_EPSILON __DBL_EPSILON__ -#define LDBL_EPSILON __LDBL_EPSILON__ - -/* Minimum normalized positive floating-point number, b**(emin - 1). */ -#undef FLT_MIN -#undef DBL_MIN -#undef LDBL_MIN -#define FLT_MIN __FLT_MIN__ -#define DBL_MIN __DBL_MIN__ -#define LDBL_MIN __LDBL_MIN__ - -/* Addition rounds to 0: zero, 1: nearest, 2: +inf, 3: -inf, -1: unknown. */ -/* ??? This is supposed to change with calls to fesetround in <fenv.h>. */ -#undef FLT_ROUNDS -#define FLT_ROUNDS 1 - -#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L -/* The floating-point expression evaluation method. 
- -1 indeterminate - 0 evaluate all operations and constants just to the range and - precision of the type - 1 evaluate operations and constants of type float and double - to the range and precision of the double type, evaluate - long double operations and constants to the range and - precision of the long double type - 2 evaluate all operations and constants to the range and - precision of the long double type - - ??? This ought to change with the setting of the fp control word; - the value provided by the compiler assumes the widest setting. */ -#undef FLT_EVAL_METHOD -#define FLT_EVAL_METHOD __FLT_EVAL_METHOD__ - -/* Number of decimal digits, n, such that any floating-point number in the - widest supported floating type with pmax radix b digits can be rounded - to a floating-point number with n decimal digits and back again without - change to the value, - - pmax * log10(b) if b is a power of 10 - ceil(1 + pmax * log10(b)) otherwise -*/ -#undef DECIMAL_DIG -#define DECIMAL_DIG __DECIMAL_DIG__ - -#endif /* C99 */ - -#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L -/* Versions of DECIMAL_DIG for each floating-point type. */ -#undef FLT_DECIMAL_DIG -#undef DBL_DECIMAL_DIG -#undef LDBL_DECIMAL_DIG -#define FLT_DECIMAL_DIG __FLT_DECIMAL_DIG__ -#define DBL_DECIMAL_DIG __DBL_DECIMAL_DIG__ -#define LDBL_DECIMAL_DIG __DECIMAL_DIG__ - -/* Whether types support subnormal numbers. */ -#undef FLT_HAS_SUBNORM -#undef DBL_HAS_SUBNORM -#undef LDBL_HAS_SUBNORM -#define FLT_HAS_SUBNORM __FLT_HAS_DENORM__ -#define DBL_HAS_SUBNORM __DBL_HAS_DENORM__ -#define LDBL_HAS_SUBNORM __LDBL_HAS_DENORM__ - -/* Minimum positive values, including subnormals. 
*/ -#undef FLT_TRUE_MIN -#undef DBL_TRUE_MIN -#undef LDBL_TRUE_MIN -#if __FLT_HAS_DENORM__ -#define FLT_TRUE_MIN __FLT_DENORM_MIN__ -#else -#define FLT_TRUE_MIN __FLT_MIN__ -#endif -#if __DBL_HAS_DENORM__ -#define DBL_TRUE_MIN __DBL_DENORM_MIN__ -#else -#define DBL_TRUE_MIN __DBL_MIN__ -#endif -#if __LDBL_HAS_DENORM__ -#define LDBL_TRUE_MIN __LDBL_DENORM_MIN__ -#else -#define LDBL_TRUE_MIN __LDBL_MIN__ -#endif - -#endif /* C11 */ - -#ifdef __STDC_WANT_DEC_FP__ -/* Draft Technical Report 24732, extension for decimal floating-point - arithmetic: Characteristic of decimal floating types <float.h>. */ - -/* Number of base-FLT_RADIX digits in the significand, p. */ -#undef DEC32_MANT_DIG -#undef DEC64_MANT_DIG -#undef DEC128_MANT_DIG -#define DEC32_MANT_DIG __DEC32_MANT_DIG__ -#define DEC64_MANT_DIG __DEC64_MANT_DIG__ -#define DEC128_MANT_DIG __DEC128_MANT_DIG__ - -/* Minimum exponent. */ -#undef DEC32_MIN_EXP -#undef DEC64_MIN_EXP -#undef DEC128_MIN_EXP -#define DEC32_MIN_EXP __DEC32_MIN_EXP__ -#define DEC64_MIN_EXP __DEC64_MIN_EXP__ -#define DEC128_MIN_EXP __DEC128_MIN_EXP__ - -/* Maximum exponent. */ -#undef DEC32_MAX_EXP -#undef DEC64_MAX_EXP -#undef DEC128_MAX_EXP -#define DEC32_MAX_EXP __DEC32_MAX_EXP__ -#define DEC64_MAX_EXP __DEC64_MAX_EXP__ -#define DEC128_MAX_EXP __DEC128_MAX_EXP__ - -/* Maximum representable finite decimal floating-point number - (there are 6, 15, and 33 9s after the decimal points respectively). */ -#undef DEC32_MAX -#undef DEC64_MAX -#undef DEC128_MAX -#define DEC32_MAX __DEC32_MAX__ -#define DEC64_MAX __DEC64_MAX__ -#define DEC128_MAX __DEC128_MAX__ - -/* The difference between 1 and the least value greater than 1 that is - representable in the given floating point type. */ -#undef DEC32_EPSILON -#undef DEC64_EPSILON -#undef DEC128_EPSILON -#define DEC32_EPSILON __DEC32_EPSILON__ -#define DEC64_EPSILON __DEC64_EPSILON__ -#define DEC128_EPSILON __DEC128_EPSILON__ - -/* Minimum normalized positive floating-point number. 
*/ -#undef DEC32_MIN -#undef DEC64_MIN -#undef DEC128_MIN -#define DEC32_MIN __DEC32_MIN__ -#define DEC64_MIN __DEC64_MIN__ -#define DEC128_MIN __DEC128_MIN__ - -/* Minimum subnormal positive floating-point number. */ -#undef DEC32_SUBNORMAL_MIN -#undef DEC64_SUBNORMAL_MIN -#undef DEC128_SUBNORMAL_MIN -#define DEC32_SUBNORMAL_MIN __DEC32_SUBNORMAL_MIN__ -#define DEC64_SUBNORMAL_MIN __DEC64_SUBNORMAL_MIN__ -#define DEC128_SUBNORMAL_MIN __DEC128_SUBNORMAL_MIN__ - -/* The floating-point expression evaluation method. - -1 indeterminate - 0 evaluate all operations and constants just to the range and - precision of the type - 1 evaluate operations and constants of type _Decimal32 - and _Decimal64 to the range and precision of the _Decimal64 - type, evaluate _Decimal128 operations and constants to the - range and precision of the _Decimal128 type; - 2 evaluate all operations and constants to the range and - precision of the _Decimal128 type. */ - -#undef DEC_EVAL_METHOD -#define DEC_EVAL_METHOD __DEC_EVAL_METHOD__ - -#endif /* __STDC_WANT_DEC_FP__ */ - -#endif /* _FLOAT_H___ */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/fma4intrin.h b/lib/gcc/x86_64-linux-android/4.8/include/fma4intrin.h deleted file mode 100644 index 00ba781..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/fma4intrin.h +++ /dev/null @@ -1,236 +0,0 @@ -/* Copyright (C) 2007-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. 
- - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#ifndef _X86INTRIN_H_INCLUDED -# error "Never use <fma4intrin.h> directly; include <x86intrin.h> instead." -#endif - -#ifndef _FMA4INTRIN_H_INCLUDED -#define _FMA4INTRIN_H_INCLUDED - -#ifndef __FMA4__ -# error "FMA4 instruction set not enabled" -#else - -/* We need definitions from the SSE4A, SSE3, SSE2 and SSE header files. */ -#include <ammintrin.h> - -/* 128b Floating point multiply/add type instructions. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_macc_ps (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128) __builtin_ia32_vfmaddps ((__v4sf)__A, (__v4sf)__B, (__v4sf)__C); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_macc_pd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d) __builtin_ia32_vfmaddpd ((__v2df)__A, (__v2df)__B, (__v2df)__C); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_macc_ss (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128) __builtin_ia32_vfmaddss ((__v4sf)__A, (__v4sf)__B, (__v4sf)__C); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_macc_sd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d) __builtin_ia32_vfmaddsd ((__v2df)__A, (__v2df)__B, (__v2df)__C); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_msub_ps (__m128 __A, __m128 __B, __m128 __C) - -{ - return (__m128) __builtin_ia32_vfmaddps ((__v4sf)__A, 
(__v4sf)__B, -(__v4sf)__C); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_msub_pd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d) __builtin_ia32_vfmaddpd ((__v2df)__A, (__v2df)__B, -(__v2df)__C); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_msub_ss (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128) __builtin_ia32_vfmaddss ((__v4sf)__A, (__v4sf)__B, -(__v4sf)__C); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_msub_sd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d) __builtin_ia32_vfmaddsd ((__v2df)__A, (__v2df)__B, -(__v2df)__C); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_nmacc_ps (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128) __builtin_ia32_vfmaddps (-(__v4sf)__A, (__v4sf)__B, (__v4sf)__C); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_nmacc_pd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d) __builtin_ia32_vfmaddpd (-(__v2df)__A, (__v2df)__B, (__v2df)__C); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_nmacc_ss (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128) __builtin_ia32_vfmaddss (-(__v4sf)__A, (__v4sf)__B, (__v4sf)__C); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_nmacc_sd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d) __builtin_ia32_vfmaddsd (-(__v2df)__A, (__v2df)__B, (__v2df)__C); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_nmsub_ps (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128) __builtin_ia32_vfmaddps (-(__v4sf)__A, (__v4sf)__B, -(__v4sf)__C); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, 
__artificial__)) -_mm_nmsub_pd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d) __builtin_ia32_vfmaddpd (-(__v2df)__A, (__v2df)__B, -(__v2df)__C); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_nmsub_ss (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128) __builtin_ia32_vfmaddss (-(__v4sf)__A, (__v4sf)__B, -(__v4sf)__C); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_nmsub_sd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d) __builtin_ia32_vfmaddsd (-(__v2df)__A, (__v2df)__B, -(__v2df)__C); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maddsub_ps (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128) __builtin_ia32_vfmaddsubps ((__v4sf)__A, (__v4sf)__B, (__v4sf)__C); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maddsub_pd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d) __builtin_ia32_vfmaddsubpd ((__v2df)__A, (__v2df)__B, (__v2df)__C); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_msubadd_ps (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128) __builtin_ia32_vfmaddsubps ((__v4sf)__A, (__v4sf)__B, -(__v4sf)__C); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_msubadd_pd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d) __builtin_ia32_vfmaddsubpd ((__v2df)__A, (__v2df)__B, -(__v2df)__C); -} - -/* 256b Floating point multiply/add type instructions. 
*/ -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_macc_ps (__m256 __A, __m256 __B, __m256 __C) -{ - return (__m256) __builtin_ia32_vfmaddps256 ((__v8sf)__A, (__v8sf)__B, (__v8sf)__C); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_macc_pd (__m256d __A, __m256d __B, __m256d __C) -{ - return (__m256d) __builtin_ia32_vfmaddpd256 ((__v4df)__A, (__v4df)__B, (__v4df)__C); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_msub_ps (__m256 __A, __m256 __B, __m256 __C) - -{ - return (__m256) __builtin_ia32_vfmaddps256 ((__v8sf)__A, (__v8sf)__B, -(__v8sf)__C); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_msub_pd (__m256d __A, __m256d __B, __m256d __C) -{ - return (__m256d) __builtin_ia32_vfmaddpd256 ((__v4df)__A, (__v4df)__B, -(__v4df)__C); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_nmacc_ps (__m256 __A, __m256 __B, __m256 __C) -{ - return (__m256) __builtin_ia32_vfmaddps256 (-(__v8sf)__A, (__v8sf)__B, (__v8sf)__C); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_nmacc_pd (__m256d __A, __m256d __B, __m256d __C) -{ - return (__m256d) __builtin_ia32_vfmaddpd256 (-(__v4df)__A, (__v4df)__B, (__v4df)__C); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_nmsub_ps (__m256 __A, __m256 __B, __m256 __C) -{ - return (__m256) __builtin_ia32_vfmaddps256 (-(__v8sf)__A, (__v8sf)__B, -(__v8sf)__C); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_nmsub_pd (__m256d __A, __m256d __B, __m256d __C) -{ - return (__m256d) __builtin_ia32_vfmaddpd256 (-(__v4df)__A, (__v4df)__B, -(__v4df)__C); -} - -extern __inline __m256 __attribute__((__gnu_inline__, 
__always_inline__, __artificial__)) -_mm256_maddsub_ps (__m256 __A, __m256 __B, __m256 __C) -{ - return (__m256) __builtin_ia32_vfmaddsubps256 ((__v8sf)__A, (__v8sf)__B, (__v8sf)__C); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_maddsub_pd (__m256d __A, __m256d __B, __m256d __C) -{ - return (__m256d) __builtin_ia32_vfmaddsubpd256 ((__v4df)__A, (__v4df)__B, (__v4df)__C); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_msubadd_ps (__m256 __A, __m256 __B, __m256 __C) -{ - return (__m256) __builtin_ia32_vfmaddsubps256 ((__v8sf)__A, (__v8sf)__B, -(__v8sf)__C); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_msubadd_pd (__m256d __A, __m256d __B, __m256d __C) -{ - return (__m256d) __builtin_ia32_vfmaddsubpd256 ((__v4df)__A, (__v4df)__B, -(__v4df)__C); -} - -#endif - -#endif diff --git a/lib/gcc/x86_64-linux-android/4.8/include/fmaintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/fmaintrin.h deleted file mode 100644 index 6ede84b..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/fmaintrin.h +++ /dev/null @@ -1,297 +0,0 @@ -/* Copyright (C) 2011-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. 
- - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#ifndef _IMMINTRIN_H_INCLUDED -# error "Never use <fmaintrin.h> directly; include <immintrin.h> instead." -#endif - -#ifndef _FMAINTRIN_H_INCLUDED -#define _FMAINTRIN_H_INCLUDED - -#ifndef __FMA__ -# error "FMA instruction set not enabled" -#else - -extern __inline __m128d -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fmadd_pd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d)__builtin_ia32_vfmaddpd ((__v2df)__A, (__v2df)__B, - (__v2df)__C); -} - -extern __inline __m256d -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_fmadd_pd (__m256d __A, __m256d __B, __m256d __C) -{ - return (__m256d)__builtin_ia32_vfmaddpd256 ((__v4df)__A, (__v4df)__B, - (__v4df)__C); -} - -extern __inline __m128 -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fmadd_ps (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128)__builtin_ia32_vfmaddps ((__v4sf)__A, (__v4sf)__B, - (__v4sf)__C); -} - -extern __inline __m256 -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_fmadd_ps (__m256 __A, __m256 __B, __m256 __C) -{ - return (__m256)__builtin_ia32_vfmaddps256 ((__v8sf)__A, (__v8sf)__B, - (__v8sf)__C); -} - -extern __inline __m128d -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fmadd_sd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d) __builtin_ia32_vfmaddsd3 ((__v2df)__A, (__v2df)__B, - (__v2df)__C); -} - -extern __inline __m128 -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fmadd_ss (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128) __builtin_ia32_vfmaddss3 ((__v4sf)__A, (__v4sf)__B, - (__v4sf)__C); -} - -extern __inline __m128d 
-__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fmsub_pd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d)__builtin_ia32_vfmaddpd ((__v2df)__A, (__v2df)__B, - -(__v2df)__C); -} - -extern __inline __m256d -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_fmsub_pd (__m256d __A, __m256d __B, __m256d __C) -{ - return (__m256d)__builtin_ia32_vfmaddpd256 ((__v4df)__A, (__v4df)__B, - -(__v4df)__C); -} - -extern __inline __m128 -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fmsub_ps (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128)__builtin_ia32_vfmaddps ((__v4sf)__A, (__v4sf)__B, - -(__v4sf)__C); -} - -extern __inline __m256 -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_fmsub_ps (__m256 __A, __m256 __B, __m256 __C) -{ - return (__m256)__builtin_ia32_vfmaddps256 ((__v8sf)__A, (__v8sf)__B, - -(__v8sf)__C); -} - -extern __inline __m128d -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fmsub_sd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d)__builtin_ia32_vfmaddsd3 ((__v2df)__A, (__v2df)__B, - -(__v2df)__C); -} - -extern __inline __m128 -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fmsub_ss (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128)__builtin_ia32_vfmaddss3 ((__v4sf)__A, (__v4sf)__B, - -(__v4sf)__C); -} - -extern __inline __m128d -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fnmadd_pd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d)__builtin_ia32_vfmaddpd (-(__v2df)__A, (__v2df)__B, - (__v2df)__C); -} - -extern __inline __m256d -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_fnmadd_pd (__m256d __A, __m256d __B, __m256d __C) -{ - return (__m256d)__builtin_ia32_vfmaddpd256 (-(__v4df)__A, (__v4df)__B, - (__v4df)__C); -} - -extern __inline __m128 -__attribute__((__gnu_inline__, __always_inline__, 
__artificial__)) -_mm_fnmadd_ps (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128)__builtin_ia32_vfmaddps (-(__v4sf)__A, (__v4sf)__B, - (__v4sf)__C); -} - -extern __inline __m256 -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_fnmadd_ps (__m256 __A, __m256 __B, __m256 __C) -{ - return (__m256)__builtin_ia32_vfmaddps256 (-(__v8sf)__A, (__v8sf)__B, - (__v8sf)__C); -} - -extern __inline __m128d -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fnmadd_sd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d)__builtin_ia32_vfmaddsd3 ((__v2df)__A, -(__v2df)__B, - (__v2df)__C); -} - -extern __inline __m128 -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fnmadd_ss (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128)__builtin_ia32_vfmaddss3 ((__v4sf)__A, -(__v4sf)__B, - (__v4sf)__C); -} - -extern __inline __m128d -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fnmsub_pd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d)__builtin_ia32_vfmaddpd (-(__v2df)__A, (__v2df)__B, - -(__v2df)__C); -} - -extern __inline __m256d -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_fnmsub_pd (__m256d __A, __m256d __B, __m256d __C) -{ - return (__m256d)__builtin_ia32_vfmaddpd256 (-(__v4df)__A, (__v4df)__B, - -(__v4df)__C); -} - -extern __inline __m128 -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fnmsub_ps (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128)__builtin_ia32_vfmaddps (-(__v4sf)__A, (__v4sf)__B, - -(__v4sf)__C); -} - -extern __inline __m256 -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_fnmsub_ps (__m256 __A, __m256 __B, __m256 __C) -{ - return (__m256)__builtin_ia32_vfmaddps256 (-(__v8sf)__A, (__v8sf)__B, - -(__v8sf)__C); -} - -extern __inline __m128d -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fnmsub_sd (__m128d __A, __m128d __B, 
__m128d __C) -{ - return (__m128d)__builtin_ia32_vfmaddsd3 ((__v2df)__A, -(__v2df)__B, - -(__v2df)__C); -} - -extern __inline __m128 -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fnmsub_ss (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128)__builtin_ia32_vfmaddss3 ((__v4sf)__A, -(__v4sf)__B, - -(__v4sf)__C); -} - -extern __inline __m128d -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fmaddsub_pd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d)__builtin_ia32_vfmaddsubpd ((__v2df)__A, (__v2df)__B, - (__v2df)__C); -} - -extern __inline __m256d -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_fmaddsub_pd (__m256d __A, __m256d __B, __m256d __C) -{ - return (__m256d)__builtin_ia32_vfmaddsubpd256 ((__v4df)__A, - (__v4df)__B, - (__v4df)__C); -} - -extern __inline __m128 -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fmaddsub_ps (__m128 __A, __m128 __B, __m128 __C) -{ - return (__m128)__builtin_ia32_vfmaddsubps ((__v4sf)__A, (__v4sf)__B, - (__v4sf)__C); -} - -extern __inline __m256 -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_fmaddsub_ps (__m256 __A, __m256 __B, __m256 __C) -{ - return (__m256)__builtin_ia32_vfmaddsubps256 ((__v8sf)__A, - (__v8sf)__B, - (__v8sf)__C); -} - -extern __inline __m128d -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fmsubadd_pd (__m128d __A, __m128d __B, __m128d __C) -{ - return (__m128d)__builtin_ia32_vfmaddsubpd ((__v2df)__A, (__v2df)__B, - -(__v2df)__C); -} - -extern __inline __m256d -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_fmsubadd_pd (__m256d __A, __m256d __B, __m256d __C) -{ - return (__m256d)__builtin_ia32_vfmaddsubpd256 ((__v4df)__A, - (__v4df)__B, - -(__v4df)__C); -} - -extern __inline __m128 -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fmsubadd_ps (__m128 __A, __m128 __B, __m128 __C) -{ - return 
(__m128)__builtin_ia32_vfmaddsubps ((__v4sf)__A, (__v4sf)__B, - -(__v4sf)__C); -} - -extern __inline __m256 -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_fmsubadd_ps (__m256 __A, __m256 __B, __m256 __C) -{ - return (__m256)__builtin_ia32_vfmaddsubps256 ((__v8sf)__A, - (__v8sf)__B, - -(__v8sf)__C); -} - -#endif - -#endif diff --git a/lib/gcc/x86_64-linux-android/4.8/include/fxsrintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/fxsrintrin.h deleted file mode 100644 index 9b63222..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/fxsrintrin.h +++ /dev/null @@ -1,61 +0,0 @@ -/* Copyright (C) 2012-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -/* #if !defined _X86INTRIN_H_INCLUDED && !defined _IMMINTRIN_H_INCLUDED */ -/* # error "Never use <fxsrintrin.h> directly; include <x86intrin.h> instead." 
*/ -/* #endif */ - -#ifndef _FXSRINTRIN_H_INCLUDED -#define _FXSRINTRIN_H_INCLUDED - -extern __inline void -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_fxsave (void *__P) -{ - return __builtin_ia32_fxsave (__P); -} - -extern __inline void -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_fxrstor (void *__P) -{ - return __builtin_ia32_fxrstor (__P); -} - -#ifdef __x86_64__ -extern __inline void -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_fxsave64 (void *__P) -{ - return __builtin_ia32_fxsave64 (__P); -} - -extern __inline void -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_fxrstor64 (void *__P) -{ - return __builtin_ia32_fxrstor64 (__P); -} -#endif - -#endif /* _FXSRINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/ia32intrin.h b/lib/gcc/x86_64-linux-android/4.8/include/ia32intrin.h deleted file mode 100644 index 131af0b..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/ia32intrin.h +++ /dev/null @@ -1,242 +0,0 @@ -/* Copyright (C) 2009-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. 
If not, see - <http://www.gnu.org/licenses/>. */ - -#ifndef _X86INTRIN_H_INCLUDED -# error "Never use <ia32intrin.h> directly; include <x86intrin.h> instead." -#endif - -/* 32bit bsf */ -extern __inline int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__bsfd (int __X) -{ - return __builtin_ctz (__X); -} - -/* 32bit bsr */ -extern __inline int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__bsrd (int __X) -{ - return __builtin_ia32_bsrsi (__X); -} - -/* 32bit bswap */ -extern __inline int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__bswapd (int __X) -{ - return __builtin_bswap32 (__X); -} - -#ifdef __SSE4_2__ -/* 32bit accumulate CRC32 (polynomial 0x11EDC6F41) value. */ -extern __inline unsigned int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__crc32b (unsigned int __C, unsigned char __V) -{ - return __builtin_ia32_crc32qi (__C, __V); -} - -extern __inline unsigned int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__crc32w (unsigned int __C, unsigned short __V) -{ - return __builtin_ia32_crc32hi (__C, __V); -} - -extern __inline unsigned int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__crc32d (unsigned int __C, unsigned int __V) -{ - return __builtin_ia32_crc32si (__C, __V); -} -#endif /* SSE4.2 */ - -/* 32bit popcnt */ -extern __inline int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__popcntd (unsigned int __X) -{ - return __builtin_popcount (__X); -} - -/* rdpmc */ -extern __inline unsigned long long -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__rdpmc (int __S) -{ - return __builtin_ia32_rdpmc (__S); -} - -/* rdtsc */ -extern __inline unsigned long long -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__rdtsc (void) -{ - return __builtin_ia32_rdtsc (); -} - -/* rdtscp */ -extern __inline unsigned long long -__attribute__((__gnu_inline__, 
__always_inline__, __artificial__)) -__rdtscp (unsigned int *__A) -{ - return __builtin_ia32_rdtscp (__A); -} - -/* 8bit rol */ -extern __inline unsigned char -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__rolb (unsigned char __X, int __C) -{ - return __builtin_ia32_rolqi (__X, __C); -} - -/* 16bit rol */ -extern __inline unsigned short -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__rolw (unsigned short __X, int __C) -{ - return __builtin_ia32_rolhi (__X, __C); -} - -/* 32bit rol */ -extern __inline unsigned int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__rold (unsigned int __X, int __C) -{ - return (__X << __C) | (__X >> (32 - __C)); -} - -/* 8bit ror */ -extern __inline unsigned char -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__rorb (unsigned char __X, int __C) -{ - return __builtin_ia32_rorqi (__X, __C); -} - -/* 16bit ror */ -extern __inline unsigned short -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__rorw (unsigned short __X, int __C) -{ - return __builtin_ia32_rorhi (__X, __C); -} - -/* 32bit ror */ -extern __inline unsigned int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__rord (unsigned int __X, int __C) -{ - return (__X >> __C) | (__X << (32 - __C)); -} - -/* Pause */ -extern __inline void -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__pause (void) -{ - __builtin_ia32_pause (); -} - -#ifdef __x86_64__ -/* 64bit bsf */ -extern __inline int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__bsfq (long long __X) -{ - return __builtin_ctzll (__X); -} - -/* 64bit bsr */ -extern __inline int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__bsrq (long long __X) -{ - return __builtin_ia32_bsrdi (__X); -} - -/* 64bit bswap */ -extern __inline long long -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__bswapq (long long __X) -{ - 
return __builtin_bswap64 (__X); -} - -#ifdef __SSE4_2__ -/* 64bit accumulate CRC32 (polynomial 0x11EDC6F41) value. */ -extern __inline unsigned long long -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__crc32q (unsigned long long __C, unsigned long long __V) -{ - return __builtin_ia32_crc32di (__C, __V); -} -#endif - -/* 64bit popcnt */ -extern __inline long long -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__popcntq (unsigned long long __X) -{ - return __builtin_popcountll (__X); -} - -/* 64bit rol */ -extern __inline unsigned long long -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__rolq (unsigned long long __X, int __C) -{ - return (__X << __C) | (__X >> (64 - __C)); -} - -/* 64bit ror */ -extern __inline unsigned long long -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__rorq (unsigned long long __X, int __C) -{ - return (__X >> __C) | (__X << (64 - __C)); -} - -#define _bswap64(a) __bswapq(a) -#define _popcnt64(a) __popcntq(a) -#define _lrotl(a,b) __rolq((a), (b)) -#define _lrotr(a,b) __rorq((a), (b)) -#else -#define _lrotl(a,b) __rold((a), (b)) -#define _lrotr(a,b) __rord((a), (b)) -#endif - -#define _bit_scan_forward(a) __bsfd(a) -#define _bit_scan_reverse(a) __bsrd(a) -#define _bswap(a) __bswapd(a) -#define _popcnt32(a) __popcntd(a) -#define _rdpmc(a) __rdpmc(a) -#define _rdtsc() __rdtsc() -#define _rdtscp(a) __rdtscp(a) -#define _rotwl(a,b) __rolw((a), (b)) -#define _rotwr(a,b) __rorw((a), (b)) -#define _rotl(a,b) __rold((a), (b)) -#define _rotr(a,b) __rord((a), (b)) diff --git a/lib/gcc/x86_64-linux-android/4.8/include/immintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/immintrin.h deleted file mode 100644 index b137753..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/immintrin.h +++ /dev/null @@ -1,176 +0,0 @@ -/* Copyright (C) 2008-2013 Free Software Foundation, Inc. - - This file is part of GCC. 
- - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#ifndef _IMMINTRIN_H_INCLUDED -#define _IMMINTRIN_H_INCLUDED - -#ifdef __MMX__ -#include <mmintrin.h> -#endif - -#ifdef __SSE__ -#include <xmmintrin.h> -#endif - -#ifdef __SSE2__ -#include <emmintrin.h> -#endif - -#ifdef __SSE3__ -#include <pmmintrin.h> -#endif - -#ifdef __SSSE3__ -#include <tmmintrin.h> -#endif - -#if defined (__SSE4_2__) || defined (__SSE4_1__) -#include <smmintrin.h> -#endif - -#if defined (__AES__) || defined (__PCLMUL__) -#include <wmmintrin.h> -#endif - -#ifdef __AVX__ -#include <avxintrin.h> -#endif - -#ifdef __AVX2__ -#include <avx2intrin.h> -#endif - -#ifdef __LZCNT__ -#include <lzcntintrin.h> -#endif - -#ifdef __BMI__ -#include <bmiintrin.h> -#endif - -#ifdef __BMI2__ -#include <bmi2intrin.h> -#endif - -#ifdef __FMA__ -#include <fmaintrin.h> -#endif - -#ifdef __F16C__ -#include <f16cintrin.h> -#endif - -#ifdef __RTM__ -#include <rtmintrin.h> -#endif - -#ifdef __RTM__ -#include <xtestintrin.h> -#endif - -#ifdef __RDRND__ -extern __inline int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_rdrand16_step (unsigned short *__P) -{ - 
return __builtin_ia32_rdrand16_step (__P); -} - -extern __inline int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_rdrand32_step (unsigned int *__P) -{ - return __builtin_ia32_rdrand32_step (__P); -} -#endif /* __RDRND__ */ - -#ifdef __x86_64__ -#ifdef __FSGSBASE__ -extern __inline unsigned int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_readfsbase_u32 (void) -{ - return __builtin_ia32_rdfsbase32 (); -} - -extern __inline unsigned long long -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_readfsbase_u64 (void) -{ - return __builtin_ia32_rdfsbase64 (); -} - -extern __inline unsigned int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_readgsbase_u32 (void) -{ - return __builtin_ia32_rdgsbase32 (); -} - -extern __inline unsigned long long -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_readgsbase_u64 (void) -{ - return __builtin_ia32_rdgsbase64 (); -} - -extern __inline void -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_writefsbase_u32 (unsigned int __B) -{ - __builtin_ia32_wrfsbase32 (__B); -} - -extern __inline void -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_writefsbase_u64 (unsigned long long __B) -{ - __builtin_ia32_wrfsbase64 (__B); -} - -extern __inline void -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_writegsbase_u32 (unsigned int __B) -{ - __builtin_ia32_wrgsbase32 (__B); -} - -extern __inline void -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_writegsbase_u64 (unsigned long long __B) -{ - __builtin_ia32_wrgsbase64 (__B); -} -#endif /* __FSGSBASE__ */ - -#ifdef __RDRND__ -extern __inline int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_rdrand64_step (unsigned long long *__P) -{ - return __builtin_ia32_rdrand64_step (__P); -} -#endif /* __RDRND__ */ -#endif /* __x86_64__ */ - -#endif /* _IMMINTRIN_H_INCLUDED */ diff --git 
a/lib/gcc/x86_64-linux-android/4.8/include/iso646.h b/lib/gcc/x86_64-linux-android/4.8/include/iso646.h deleted file mode 100644 index 36dec91..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/iso646.h +++ /dev/null @@ -1,45 +0,0 @@ -/* Copyright (C) 1997-2013 Free Software Foundation, Inc. - -This file is part of GCC. - -GCC is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; either version 3, or (at your option) -any later version. - -GCC is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. - -Under Section 7 of GPL version 3, you are granted additional -permissions described in the GCC Runtime Library Exception, version -3.1, as published by the Free Software Foundation. - -You should have received a copy of the GNU General Public License and -a copy of the GCC Runtime Library Exception along with this program; -see the files COPYING3 and COPYING.RUNTIME respectively. If not, see -<http://www.gnu.org/licenses/>. */ - -/* - * ISO C Standard: 7.9 Alternative spellings <iso646.h> - */ - -#ifndef _ISO646_H -#define _ISO646_H - -#ifndef __cplusplus -#define and && -#define and_eq &= -#define bitand & -#define bitor | -#define compl ~ -#define not ! -#define not_eq != -#define or || -#define or_eq |= -#define xor ^ -#define xor_eq ^= -#endif - -#endif diff --git a/lib/gcc/x86_64-linux-android/4.8/include/lwpintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/lwpintrin.h deleted file mode 100644 index 8c70850..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/lwpintrin.h +++ /dev/null @@ -1,100 +0,0 @@ -/* Copyright (C) 2007-2013 Free Software Foundation, Inc. - - This file is part of GCC. 
- - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#ifndef _X86INTRIN_H_INCLUDED -# error "Never use <lwpintrin.h> directly; include <x86intrin.h> instead." 
-#endif - -#ifndef _LWPINTRIN_H_INCLUDED -#define _LWPINTRIN_H_INCLUDED - -#ifndef __LWP__ -# error "LWP instruction set not enabled" -#else - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__llwpcb (void *pcbAddress) -{ - __builtin_ia32_llwpcb (pcbAddress); -} - -extern __inline void * __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__slwpcb (void) -{ - return __builtin_ia32_slwpcb (); -} - -#ifdef __OPTIMIZE__ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__lwpval32 (unsigned int data2, unsigned int data1, unsigned int flags) -{ - __builtin_ia32_lwpval32 (data2, data1, flags); -} - -#ifdef __x86_64__ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__lwpval64 (unsigned long long data2, unsigned int data1, unsigned int flags) -{ - __builtin_ia32_lwpval64 (data2, data1, flags); -} -#endif -#else -#define __lwpval32(D2, D1, F) \ - (__builtin_ia32_lwpval32 ((unsigned int) (D2), (unsigned int) (D1), \ - (unsigned int) (F))) -#ifdef __x86_64__ -#define __lwpval64(D2, D1, F) \ - (__builtin_ia32_lwpval64 ((unsigned long long) (D2), (unsigned int) (D1), \ - (unsigned int) (F))) -#endif -#endif - - -#ifdef __OPTIMIZE__ -extern __inline unsigned char __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__lwpins32 (unsigned int data2, unsigned int data1, unsigned int flags) -{ - return __builtin_ia32_lwpins32 (data2, data1, flags); -} - -#ifdef __x86_64__ -extern __inline unsigned char __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__lwpins64 (unsigned long long data2, unsigned int data1, unsigned int flags) -{ - return __builtin_ia32_lwpins64 (data2, data1, flags); -} -#endif -#else -#define __lwpins32(D2, D1, F) \ - (__builtin_ia32_lwpins32 ((unsigned int) (D2), (unsigned int) (D1), \ - (unsigned int) (F))) -#ifdef __x86_64__ -#define __lwpins64(D2, D1, F) \ - (__builtin_ia32_lwpins64 
((unsigned long long) (D2), (unsigned int) (D1), \ - (unsigned int) (F))) -#endif -#endif - -#endif /* __LWP__ */ - -#endif /* _LWPINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/lzcntintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/lzcntintrin.h deleted file mode 100644 index 9382bb9..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/lzcntintrin.h +++ /dev/null @@ -1,67 +0,0 @@ -/* Copyright (C) 2009-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#if !defined _X86INTRIN_H_INCLUDED && !defined _IMMINTRIN_H_INCLUDED -# error "Never use <lzcntintrin.h> directly; include <x86intrin.h> instead." 
-#endif - -#ifndef __LZCNT__ -# error "LZCNT instruction is not enabled" -#endif /* __LZCNT__ */ - -#ifndef _LZCNTINTRIN_H_INCLUDED -#define _LZCNTINTRIN_H_INCLUDED - -extern __inline unsigned short __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__lzcnt16 (unsigned short __X) -{ - return __builtin_clzs (__X); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__lzcnt32 (unsigned int __X) -{ - return __builtin_clz (__X); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_lzcnt_u32 (unsigned int __X) -{ - return __builtin_clz (__X); -} - -#ifdef __x86_64__ -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__lzcnt64 (unsigned long long __X) -{ - return __builtin_clzll (__X); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_lzcnt_u64 (unsigned long long __X) -{ - return __builtin_clzll (__X); -} -#endif - -#endif /* _LZCNTINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/mm3dnow.h b/lib/gcc/x86_64-linux-android/4.8/include/mm3dnow.h deleted file mode 100644 index 7e806b7..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/mm3dnow.h +++ /dev/null @@ -1,210 +0,0 @@ -/* Copyright (C) 2004-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. 
- - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -/* Implemented from the mm3dnow.h (of supposedly AMD origin) included with - MSVC 7.1. */ - -#ifndef _MM3DNOW_H_INCLUDED -#define _MM3DNOW_H_INCLUDED - -#ifdef __3dNOW__ - -#include <mmintrin.h> -#include <prfchwintrin.h> - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_femms (void) -{ - __builtin_ia32_femms(); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pavgusb (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pavgusb ((__v8qi)__A, (__v8qi)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pf2id (__m64 __A) -{ - return (__m64)__builtin_ia32_pf2id ((__v2sf)__A); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pfacc (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pfacc ((__v2sf)__A, (__v2sf)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pfadd (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pfadd ((__v2sf)__A, (__v2sf)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pfcmpeq (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pfcmpeq ((__v2sf)__A, (__v2sf)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pfcmpge (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pfcmpge ((__v2sf)__A, (__v2sf)__B); -} - -extern __inline __m64 
__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pfcmpgt (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pfcmpgt ((__v2sf)__A, (__v2sf)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pfmax (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pfmax ((__v2sf)__A, (__v2sf)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pfmin (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pfmin ((__v2sf)__A, (__v2sf)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pfmul (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pfmul ((__v2sf)__A, (__v2sf)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pfrcp (__m64 __A) -{ - return (__m64)__builtin_ia32_pfrcp ((__v2sf)__A); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pfrcpit1 (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pfrcpit1 ((__v2sf)__A, (__v2sf)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pfrcpit2 (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pfrcpit2 ((__v2sf)__A, (__v2sf)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pfrsqrt (__m64 __A) -{ - return (__m64)__builtin_ia32_pfrsqrt ((__v2sf)__A); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pfrsqit1 (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pfrsqit1 ((__v2sf)__A, (__v2sf)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pfsub (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pfsub ((__v2sf)__A, (__v2sf)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, 
__artificial__)) -_m_pfsubr (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pfsubr ((__v2sf)__A, (__v2sf)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pi2fd (__m64 __A) -{ - return (__m64)__builtin_ia32_pi2fd ((__v2si)__A); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pmulhrw (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pmulhrw ((__v4hi)__A, (__v4hi)__B); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_prefetch (void *__P) -{ - __builtin_prefetch (__P, 0, 3 /* _MM_HINT_T0 */); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_from_float (float __A) -{ - return __extension__ (__m64)(__v2sf){ __A, 0.0f }; -} - -extern __inline float __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_to_float (__m64 __A) -{ - union { __v2sf v; float a[2]; } __tmp; - __tmp.v = (__v2sf)__A; - return __tmp.a[0]; -} - -#ifdef __3dNOW_A__ - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pf2iw (__m64 __A) -{ - return (__m64)__builtin_ia32_pf2iw ((__v2sf)__A); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pfnacc (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pfnacc ((__v2sf)__A, (__v2sf)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pfpnacc (__m64 __A, __m64 __B) -{ - return (__m64)__builtin_ia32_pfpnacc ((__v2sf)__A, (__v2sf)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pi2fw (__m64 __A) -{ - return (__m64)__builtin_ia32_pi2fw ((__v2si)__A); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pswapd (__m64 __A) -{ - return (__m64)__builtin_ia32_pswapdsf ((__v2sf)__A); -} - 
-#endif /* __3dNOW_A__ */ -#endif /* __3dNOW__ */ - -#endif /* _MM3DNOW_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/mm_malloc.h b/lib/gcc/x86_64-linux-android/4.8/include/mm_malloc.h deleted file mode 100644 index ee2d1a0..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/mm_malloc.h +++ /dev/null @@ -1,63 +0,0 @@ -/* Copyright (C) 2004-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#ifndef _MM_MALLOC_H_INCLUDED -#define _MM_MALLOC_H_INCLUDED - -#include <stdlib.h> - -#if !defined(__ANDROID__) || defined(HAVE_POSIX_MEMALIGN) -/* We can't depend on <stdlib.h> since the prototype of posix_memalign - may not be visible. 
*/ -#ifndef __cplusplus -extern int posix_memalign (void **, size_t, size_t); -#else -extern "C" int posix_memalign (void **, size_t, size_t) throw (); -#endif -#endif - -static __inline void * -_mm_malloc (size_t size, size_t alignment) -{ - void *ptr; - if (alignment == 1) - return malloc (size); - if (alignment == 2 || (sizeof (void *) == 8 && alignment == 4)) - alignment = sizeof (void *); -#if !defined(__ANDROID__) || defined(HAVE_POSIX_MEMALIGN) - if (posix_memalign (&ptr, alignment, size) == 0) - return ptr; - else - return NULL; -#else - return memalign(alignment, size); -#endif -} - -static __inline void -_mm_free (void * ptr) -{ - free (ptr); -} - -#endif /* _MM_MALLOC_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/mmintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/mmintrin.h deleted file mode 100644 index c76203b..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/mmintrin.h +++ /dev/null @@ -1,920 +0,0 @@ -/* Copyright (C) 2002-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. 
*/ - -/* Implemented from the specification included in the Intel C++ Compiler - User Guide and Reference, version 9.0. */ - -#ifndef _MMINTRIN_H_INCLUDED -#define _MMINTRIN_H_INCLUDED - -#ifndef __MMX__ -# error "MMX instruction set not enabled" -#else -/* The Intel API is flexible enough that we must allow aliasing with other - vector types, and their scalar components. */ -typedef int __m64 __attribute__ ((__vector_size__ (8), __may_alias__)); - -/* Internal data types for implementing the intrinsics. */ -typedef int __v2si __attribute__ ((__vector_size__ (8))); -typedef short __v4hi __attribute__ ((__vector_size__ (8))); -typedef char __v8qi __attribute__ ((__vector_size__ (8))); -typedef long long __v1di __attribute__ ((__vector_size__ (8))); -typedef float __v2sf __attribute__ ((__vector_size__ (8))); - -/* Empty the multimedia state. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_empty (void) -{ - __builtin_ia32_emms (); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_empty (void) -{ - _mm_empty (); -} - -/* Convert I to a __m64 object. The integer is zero-extended to 64-bits. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi32_si64 (int __i) -{ - return (__m64) __builtin_ia32_vec_init_v2si (__i, 0); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_from_int (int __i) -{ - return _mm_cvtsi32_si64 (__i); -} - -#ifdef __x86_64__ -/* Convert I to a __m64 object. */ - -/* Intel intrinsic. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_from_int64 (long long __i) -{ - return (__m64) __i; -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi64_m64 (long long __i) -{ - return (__m64) __i; -} - -/* Microsoft intrinsic. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi64x_si64 (long long __i) -{ - return (__m64) __i; -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set_pi64x (long long __i) -{ - return (__m64) __i; -} -#endif - -/* Convert the lower 32 bits of the __m64 object into an integer. */ -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi64_si32 (__m64 __i) -{ - return __builtin_ia32_vec_ext_v2si ((__v2si)__i, 0); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_to_int (__m64 __i) -{ - return _mm_cvtsi64_si32 (__i); -} - -#ifdef __x86_64__ -/* Convert the __m64 object to a 64bit integer. */ - -/* Intel intrinsic. */ -extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_to_int64 (__m64 __i) -{ - return (long long)__i; -} - -extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtm64_si64 (__m64 __i) -{ - return (long long)__i; -} - -/* Microsoft intrinsic. */ -extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi64_si64x (__m64 __i) -{ - return (long long)__i; -} -#endif - -/* Pack the four 16-bit values from M1 into the lower four 8-bit values of - the result, and the four 16-bit values from M2 into the upper four 8-bit - values of the result, all with signed saturation. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_packs_pi16 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_packsswb ((__v4hi)__m1, (__v4hi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_packsswb (__m64 __m1, __m64 __m2) -{ - return _mm_packs_pi16 (__m1, __m2); -} - -/* Pack the two 32-bit values from M1 in to the lower two 16-bit values of - the result, and the two 32-bit values from M2 into the upper two 16-bit - values of the result, all with signed saturation. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_packs_pi32 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_packssdw ((__v2si)__m1, (__v2si)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_packssdw (__m64 __m1, __m64 __m2) -{ - return _mm_packs_pi32 (__m1, __m2); -} - -/* Pack the four 16-bit values from M1 into the lower four 8-bit values of - the result, and the four 16-bit values from M2 into the upper four 8-bit - values of the result, all with unsigned saturation. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_packs_pu16 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_packuswb ((__v4hi)__m1, (__v4hi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_packuswb (__m64 __m1, __m64 __m2) -{ - return _mm_packs_pu16 (__m1, __m2); -} - -/* Interleave the four 8-bit values from the high half of M1 with the four - 8-bit values from the high half of M2. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpackhi_pi8 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_punpckhbw ((__v8qi)__m1, (__v8qi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_punpckhbw (__m64 __m1, __m64 __m2) -{ - return _mm_unpackhi_pi8 (__m1, __m2); -} - -/* Interleave the two 16-bit values from the high half of M1 with the two - 16-bit values from the high half of M2. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpackhi_pi16 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_punpckhwd ((__v4hi)__m1, (__v4hi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_punpckhwd (__m64 __m1, __m64 __m2) -{ - return _mm_unpackhi_pi16 (__m1, __m2); -} - -/* Interleave the 32-bit value from the high half of M1 with the 32-bit - value from the high half of M2. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpackhi_pi32 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_punpckhdq ((__v2si)__m1, (__v2si)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_punpckhdq (__m64 __m1, __m64 __m2) -{ - return _mm_unpackhi_pi32 (__m1, __m2); -} - -/* Interleave the four 8-bit values from the low half of M1 with the four - 8-bit values from the low half of M2. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpacklo_pi8 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_punpcklbw ((__v8qi)__m1, (__v8qi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_punpcklbw (__m64 __m1, __m64 __m2) -{ - return _mm_unpacklo_pi8 (__m1, __m2); -} - -/* Interleave the two 16-bit values from the low half of M1 with the two - 16-bit values from the low half of M2. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpacklo_pi16 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_punpcklwd ((__v4hi)__m1, (__v4hi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_punpcklwd (__m64 __m1, __m64 __m2) -{ - return _mm_unpacklo_pi16 (__m1, __m2); -} - -/* Interleave the 32-bit value from the low half of M1 with the 32-bit - value from the low half of M2. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpacklo_pi32 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_punpckldq ((__v2si)__m1, (__v2si)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_punpckldq (__m64 __m1, __m64 __m2) -{ - return _mm_unpacklo_pi32 (__m1, __m2); -} - -/* Add the 8-bit values in M1 to the 8-bit values in M2. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_add_pi8 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_paddb ((__v8qi)__m1, (__v8qi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_paddb (__m64 __m1, __m64 __m2) -{ - return _mm_add_pi8 (__m1, __m2); -} - -/* Add the 16-bit values in M1 to the 16-bit values in M2. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_add_pi16 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_paddw ((__v4hi)__m1, (__v4hi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_paddw (__m64 __m1, __m64 __m2) -{ - return _mm_add_pi16 (__m1, __m2); -} - -/* Add the 32-bit values in M1 to the 32-bit values in M2. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_add_pi32 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_paddd ((__v2si)__m1, (__v2si)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_paddd (__m64 __m1, __m64 __m2) -{ - return _mm_add_pi32 (__m1, __m2); -} - -/* Add the 64-bit values in M1 to the 64-bit values in M2. */ -#ifdef __SSE2__ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_add_si64 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_paddq ((__v1di)__m1, (__v1di)__m2); -} -#endif - -/* Add the 8-bit values in M1 to the 8-bit values in M2 using signed - saturated arithmetic. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_adds_pi8 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_paddsb ((__v8qi)__m1, (__v8qi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_paddsb (__m64 __m1, __m64 __m2) -{ - return _mm_adds_pi8 (__m1, __m2); -} - -/* Add the 16-bit values in M1 to the 16-bit values in M2 using signed - saturated arithmetic. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_adds_pi16 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_paddsw ((__v4hi)__m1, (__v4hi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_paddsw (__m64 __m1, __m64 __m2) -{ - return _mm_adds_pi16 (__m1, __m2); -} - -/* Add the 8-bit values in M1 to the 8-bit values in M2 using unsigned - saturated arithmetic. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_adds_pu8 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_paddusb ((__v8qi)__m1, (__v8qi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_paddusb (__m64 __m1, __m64 __m2) -{ - return _mm_adds_pu8 (__m1, __m2); -} - -/* Add the 16-bit values in M1 to the 16-bit values in M2 using unsigned - saturated arithmetic. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_adds_pu16 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_paddusw ((__v4hi)__m1, (__v4hi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_paddusw (__m64 __m1, __m64 __m2) -{ - return _mm_adds_pu16 (__m1, __m2); -} - -/* Subtract the 8-bit values in M2 from the 8-bit values in M1. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sub_pi8 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_psubb ((__v8qi)__m1, (__v8qi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psubb (__m64 __m1, __m64 __m2) -{ - return _mm_sub_pi8 (__m1, __m2); -} - -/* Subtract the 16-bit values in M2 from the 16-bit values in M1. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sub_pi16 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_psubw ((__v4hi)__m1, (__v4hi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psubw (__m64 __m1, __m64 __m2) -{ - return _mm_sub_pi16 (__m1, __m2); -} - -/* Subtract the 32-bit values in M2 from the 32-bit values in M1. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sub_pi32 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_psubd ((__v2si)__m1, (__v2si)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psubd (__m64 __m1, __m64 __m2) -{ - return _mm_sub_pi32 (__m1, __m2); -} - -/* Subtract the 64-bit values in M2 from the 64-bit values in M1. */ -#ifdef __SSE2__ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sub_si64 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_psubq ((__v1di)__m1, (__v1di)__m2); -} -#endif - -/* Subtract the 8-bit values in M2 from the 8-bit values in M1 using signed - saturating arithmetic. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_subs_pi8 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_psubsb ((__v8qi)__m1, (__v8qi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psubsb (__m64 __m1, __m64 __m2) -{ - return _mm_subs_pi8 (__m1, __m2); -} - -/* Subtract the 16-bit values in M2 from the 16-bit values in M1 using - signed saturating arithmetic. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_subs_pi16 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_psubsw ((__v4hi)__m1, (__v4hi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psubsw (__m64 __m1, __m64 __m2) -{ - return _mm_subs_pi16 (__m1, __m2); -} - -/* Subtract the 8-bit values in M2 from the 8-bit values in M1 using - unsigned saturating arithmetic. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_subs_pu8 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_psubusb ((__v8qi)__m1, (__v8qi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psubusb (__m64 __m1, __m64 __m2) -{ - return _mm_subs_pu8 (__m1, __m2); -} - -/* Subtract the 16-bit values in M2 from the 16-bit values in M1 using - unsigned saturating arithmetic. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_subs_pu16 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_psubusw ((__v4hi)__m1, (__v4hi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psubusw (__m64 __m1, __m64 __m2) -{ - return _mm_subs_pu16 (__m1, __m2); -} - -/* Multiply four 16-bit values in M1 by four 16-bit values in M2 producing - four 32-bit intermediate results, which are then summed by pairs to - produce two 32-bit results. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_madd_pi16 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_pmaddwd ((__v4hi)__m1, (__v4hi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pmaddwd (__m64 __m1, __m64 __m2) -{ - return _mm_madd_pi16 (__m1, __m2); -} - -/* Multiply four signed 16-bit values in M1 by four signed 16-bit values in - M2 and produce the high 16 bits of the 32-bit results. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mulhi_pi16 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_pmulhw ((__v4hi)__m1, (__v4hi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pmulhw (__m64 __m1, __m64 __m2) -{ - return _mm_mulhi_pi16 (__m1, __m2); -} - -/* Multiply four 16-bit values in M1 by four 16-bit values in M2 and produce - the low 16 bits of the results. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mullo_pi16 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_pmullw ((__v4hi)__m1, (__v4hi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pmullw (__m64 __m1, __m64 __m2) -{ - return _mm_mullo_pi16 (__m1, __m2); -} - -/* Shift four 16-bit values in M left by COUNT. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sll_pi16 (__m64 __m, __m64 __count) -{ - return (__m64) __builtin_ia32_psllw ((__v4hi)__m, (__v4hi)__count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psllw (__m64 __m, __m64 __count) -{ - return _mm_sll_pi16 (__m, __count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_slli_pi16 (__m64 __m, int __count) -{ - return (__m64) __builtin_ia32_psllwi ((__v4hi)__m, __count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psllwi (__m64 __m, int __count) -{ - return _mm_slli_pi16 (__m, __count); -} - -/* Shift two 32-bit values in M left by COUNT. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sll_pi32 (__m64 __m, __m64 __count) -{ - return (__m64) __builtin_ia32_pslld ((__v2si)__m, (__v2si)__count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pslld (__m64 __m, __m64 __count) -{ - return _mm_sll_pi32 (__m, __count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_slli_pi32 (__m64 __m, int __count) -{ - return (__m64) __builtin_ia32_pslldi ((__v2si)__m, __count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pslldi (__m64 __m, int __count) -{ - return _mm_slli_pi32 (__m, __count); -} - -/* Shift the 64-bit value in M left by COUNT. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sll_si64 (__m64 __m, __m64 __count) -{ - return (__m64) __builtin_ia32_psllq ((__v1di)__m, (__v1di)__count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psllq (__m64 __m, __m64 __count) -{ - return _mm_sll_si64 (__m, __count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_slli_si64 (__m64 __m, int __count) -{ - return (__m64) __builtin_ia32_psllqi ((__v1di)__m, __count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psllqi (__m64 __m, int __count) -{ - return _mm_slli_si64 (__m, __count); -} - -/* Shift four 16-bit values in M right by COUNT; shift in the sign bit. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sra_pi16 (__m64 __m, __m64 __count) -{ - return (__m64) __builtin_ia32_psraw ((__v4hi)__m, (__v4hi)__count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psraw (__m64 __m, __m64 __count) -{ - return _mm_sra_pi16 (__m, __count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srai_pi16 (__m64 __m, int __count) -{ - return (__m64) __builtin_ia32_psrawi ((__v4hi)__m, __count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psrawi (__m64 __m, int __count) -{ - return _mm_srai_pi16 (__m, __count); -} - -/* Shift two 32-bit values in M right by COUNT; shift in the sign bit. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sra_pi32 (__m64 __m, __m64 __count) -{ - return (__m64) __builtin_ia32_psrad ((__v2si)__m, (__v2si)__count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psrad (__m64 __m, __m64 __count) -{ - return _mm_sra_pi32 (__m, __count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srai_pi32 (__m64 __m, int __count) -{ - return (__m64) __builtin_ia32_psradi ((__v2si)__m, __count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psradi (__m64 __m, int __count) -{ - return _mm_srai_pi32 (__m, __count); -} - -/* Shift four 16-bit values in M right by COUNT; shift in zeros. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srl_pi16 (__m64 __m, __m64 __count) -{ - return (__m64) __builtin_ia32_psrlw ((__v4hi)__m, (__v4hi)__count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psrlw (__m64 __m, __m64 __count) -{ - return _mm_srl_pi16 (__m, __count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srli_pi16 (__m64 __m, int __count) -{ - return (__m64) __builtin_ia32_psrlwi ((__v4hi)__m, __count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psrlwi (__m64 __m, int __count) -{ - return _mm_srli_pi16 (__m, __count); -} - -/* Shift two 32-bit values in M right by COUNT; shift in zeros. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srl_pi32 (__m64 __m, __m64 __count) -{ - return (__m64) __builtin_ia32_psrld ((__v2si)__m, (__v2si)__count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psrld (__m64 __m, __m64 __count) -{ - return _mm_srl_pi32 (__m, __count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srli_pi32 (__m64 __m, int __count) -{ - return (__m64) __builtin_ia32_psrldi ((__v2si)__m, __count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psrldi (__m64 __m, int __count) -{ - return _mm_srli_pi32 (__m, __count); -} - -/* Shift the 64-bit value in M right by COUNT; shift in zeros. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srl_si64 (__m64 __m, __m64 __count) -{ - return (__m64) __builtin_ia32_psrlq ((__v1di)__m, (__v1di)__count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psrlq (__m64 __m, __m64 __count) -{ - return _mm_srl_si64 (__m, __count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_srli_si64 (__m64 __m, int __count) -{ - return (__m64) __builtin_ia32_psrlqi ((__v1di)__m, __count); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psrlqi (__m64 __m, int __count) -{ - return _mm_srli_si64 (__m, __count); -} - -/* Bit-wise AND the 64-bit values in M1 and M2. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_and_si64 (__m64 __m1, __m64 __m2) -{ - return __builtin_ia32_pand (__m1, __m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pand (__m64 __m1, __m64 __m2) -{ - return _mm_and_si64 (__m1, __m2); -} - -/* Bit-wise complement the 64-bit value in M1 and bit-wise AND it with the - 64-bit value in M2. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_andnot_si64 (__m64 __m1, __m64 __m2) -{ - return __builtin_ia32_pandn (__m1, __m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pandn (__m64 __m1, __m64 __m2) -{ - return _mm_andnot_si64 (__m1, __m2); -} - -/* Bit-wise inclusive OR the 64-bit values in M1 and M2. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_or_si64 (__m64 __m1, __m64 __m2) -{ - return __builtin_ia32_por (__m1, __m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_por (__m64 __m1, __m64 __m2) -{ - return _mm_or_si64 (__m1, __m2); -} - -/* Bit-wise exclusive OR the 64-bit values in M1 and M2. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_xor_si64 (__m64 __m1, __m64 __m2) -{ - return __builtin_ia32_pxor (__m1, __m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pxor (__m64 __m1, __m64 __m2) -{ - return _mm_xor_si64 (__m1, __m2); -} - -/* Compare eight 8-bit values. The result of the comparison is 0xFF if the - test is true and zero if false. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpeq_pi8 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_pcmpeqb ((__v8qi)__m1, (__v8qi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pcmpeqb (__m64 __m1, __m64 __m2) -{ - return _mm_cmpeq_pi8 (__m1, __m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpgt_pi8 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_pcmpgtb ((__v8qi)__m1, (__v8qi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pcmpgtb (__m64 __m1, __m64 __m2) -{ - return _mm_cmpgt_pi8 (__m1, __m2); -} - -/* Compare four 16-bit values. The result of the comparison is 0xFFFF if - the test is true and zero if false. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpeq_pi16 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_pcmpeqw ((__v4hi)__m1, (__v4hi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pcmpeqw (__m64 __m1, __m64 __m2) -{ - return _mm_cmpeq_pi16 (__m1, __m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpgt_pi16 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_pcmpgtw ((__v4hi)__m1, (__v4hi)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pcmpgtw (__m64 __m1, __m64 __m2) -{ - return _mm_cmpgt_pi16 (__m1, __m2); -} - -/* Compare two 32-bit values. The result of the comparison is 0xFFFFFFFF if - the test is true and zero if false. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpeq_pi32 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_pcmpeqd ((__v2si)__m1, (__v2si)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pcmpeqd (__m64 __m1, __m64 __m2) -{ - return _mm_cmpeq_pi32 (__m1, __m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpgt_pi32 (__m64 __m1, __m64 __m2) -{ - return (__m64) __builtin_ia32_pcmpgtd ((__v2si)__m1, (__v2si)__m2); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pcmpgtd (__m64 __m1, __m64 __m2) -{ - return _mm_cmpgt_pi32 (__m1, __m2); -} - -/* Creates a 64-bit zero. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_setzero_si64 (void) -{ - return (__m64)0LL; -} - -/* Creates a vector of two 32-bit values; I0 is least significant. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set_pi32 (int __i1, int __i0) -{ - return (__m64) __builtin_ia32_vec_init_v2si (__i0, __i1); -} - -/* Creates a vector of four 16-bit values; W0 is least significant. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set_pi16 (short __w3, short __w2, short __w1, short __w0) -{ - return (__m64) __builtin_ia32_vec_init_v4hi (__w0, __w1, __w2, __w3); -} - -/* Creates a vector of eight 8-bit values; B0 is least significant. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set_pi8 (char __b7, char __b6, char __b5, char __b4, - char __b3, char __b2, char __b1, char __b0) -{ - return (__m64) __builtin_ia32_vec_init_v8qi (__b0, __b1, __b2, __b3, - __b4, __b5, __b6, __b7); -} - -/* Similar, but with the arguments in reverse order. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_setr_pi32 (int __i0, int __i1) -{ - return _mm_set_pi32 (__i1, __i0); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_setr_pi16 (short __w0, short __w1, short __w2, short __w3) -{ - return _mm_set_pi16 (__w3, __w2, __w1, __w0); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_setr_pi8 (char __b0, char __b1, char __b2, char __b3, - char __b4, char __b5, char __b6, char __b7) -{ - return _mm_set_pi8 (__b7, __b6, __b5, __b4, __b3, __b2, __b1, __b0); -} - -/* Creates a vector of two 32-bit values, both elements containing I. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set1_pi32 (int __i) -{ - return _mm_set_pi32 (__i, __i); -} - -/* Creates a vector of four 16-bit values, all elements containing W. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set1_pi16 (short __w) -{ - return _mm_set_pi16 (__w, __w, __w, __w); -} - -/* Creates a vector of eight 8-bit values, all elements containing B. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set1_pi8 (char __b) -{ - return _mm_set_pi8 (__b, __b, __b, __b, __b, __b, __b, __b); -} - -#endif /* __MMX__ */ -#endif /* _MMINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/nmmintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/nmmintrin.h deleted file mode 100644 index a4fbed2..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/nmmintrin.h +++ /dev/null @@ -1,37 +0,0 @@ -/* Copyright (C) 2007-2013 Free Software Foundation, Inc. - - This file is part of GCC. 
- - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -/* Implemented from the specification included in the Intel C++ Compiler - User Guide and Reference, version 10.0. */ - -#ifndef _NMMINTRIN_H_INCLUDED -#define _NMMINTRIN_H_INCLUDED - -#ifndef __SSE4_2__ -# error "SSE4.2 instruction set not enabled" -#else -/* We just include SSE4.1 header file. */ -#include <smmintrin.h> -#endif /* __SSE4_2__ */ - -#endif /* _NMMINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/omp.h b/lib/gcc/x86_64-linux-android/4.8/include/omp.h deleted file mode 100644 index 11ab7b8..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/omp.h +++ /dev/null @@ -1,107 +0,0 @@ -/* Copyright (C) 2005-2013 Free Software Foundation, Inc. - Contributed by Richard Henderson <rth@redhat.com>. - - This file is part of the GNU OpenMP Library (libgomp). - - Libgomp is free software; you can redistribute it and/or modify it - under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. 
- - Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY - WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS - FOR A PARTICULAR PURPOSE. See the GNU General Public License for - more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#ifndef OMP_H -#define OMP_H 1 - -#ifndef _LIBGOMP_OMP_LOCK_DEFINED -#define _LIBGOMP_OMP_LOCK_DEFINED 1 -/* These two structures get edited by the libgomp build process to - reflect the shape of the two types. Their internals are private - to the library. */ - -typedef struct -{ - unsigned char _x[4] - __attribute__((__aligned__(4))); -} omp_lock_t; - -typedef struct -{ - unsigned char _x[12] - __attribute__((__aligned__(4))); -} omp_nest_lock_t; -#endif - -typedef enum omp_sched_t -{ - omp_sched_static = 1, - omp_sched_dynamic = 2, - omp_sched_guided = 3, - omp_sched_auto = 4 -} omp_sched_t; - -#ifdef __cplusplus -extern "C" { -# define __GOMP_NOTHROW throw () -#else -# define __GOMP_NOTHROW __attribute__((__nothrow__)) -#endif - -extern void omp_set_num_threads (int) __GOMP_NOTHROW; -extern int omp_get_num_threads (void) __GOMP_NOTHROW; -extern int omp_get_max_threads (void) __GOMP_NOTHROW; -extern int omp_get_thread_num (void) __GOMP_NOTHROW; -extern int omp_get_num_procs (void) __GOMP_NOTHROW; - -extern int omp_in_parallel (void) __GOMP_NOTHROW; - -extern void omp_set_dynamic (int) __GOMP_NOTHROW; -extern int omp_get_dynamic (void) __GOMP_NOTHROW; - -extern void omp_set_nested (int) __GOMP_NOTHROW; -extern int omp_get_nested (void) __GOMP_NOTHROW; - -extern void omp_init_lock (omp_lock_t *) 
__GOMP_NOTHROW; -extern void omp_destroy_lock (omp_lock_t *) __GOMP_NOTHROW; -extern void omp_set_lock (omp_lock_t *) __GOMP_NOTHROW; -extern void omp_unset_lock (omp_lock_t *) __GOMP_NOTHROW; -extern int omp_test_lock (omp_lock_t *) __GOMP_NOTHROW; - -extern void omp_init_nest_lock (omp_nest_lock_t *) __GOMP_NOTHROW; -extern void omp_destroy_nest_lock (omp_nest_lock_t *) __GOMP_NOTHROW; -extern void omp_set_nest_lock (omp_nest_lock_t *) __GOMP_NOTHROW; -extern void omp_unset_nest_lock (omp_nest_lock_t *) __GOMP_NOTHROW; -extern int omp_test_nest_lock (omp_nest_lock_t *) __GOMP_NOTHROW; - -extern double omp_get_wtime (void) __GOMP_NOTHROW; -extern double omp_get_wtick (void) __GOMP_NOTHROW; - -void omp_set_schedule (omp_sched_t, int) __GOMP_NOTHROW; -void omp_get_schedule (omp_sched_t *, int *) __GOMP_NOTHROW; -int omp_get_thread_limit (void) __GOMP_NOTHROW; -void omp_set_max_active_levels (int) __GOMP_NOTHROW; -int omp_get_max_active_levels (void) __GOMP_NOTHROW; -int omp_get_level (void) __GOMP_NOTHROW; -int omp_get_ancestor_thread_num (int) __GOMP_NOTHROW; -int omp_get_team_size (int) __GOMP_NOTHROW; -int omp_get_active_level (void) __GOMP_NOTHROW; - -int omp_in_final (void) __GOMP_NOTHROW; - -#ifdef __cplusplus -} -#endif - -#endif /* OMP_H */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/pmmintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/pmmintrin.h deleted file mode 100644 index 9c6956c..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/pmmintrin.h +++ /dev/null @@ -1,127 +0,0 @@ -/* Copyright (C) 2003-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. 
- - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -/* Implemented from the specification included in the Intel C++ Compiler - User Guide and Reference, version 9.0. */ - -#ifndef _PMMINTRIN_H_INCLUDED -#define _PMMINTRIN_H_INCLUDED - -#ifndef __SSE3__ -# error "SSE3 instruction set not enabled" -#else - -/* We need definitions from the SSE2 and SSE header files*/ -#include <emmintrin.h> - -/* Additional bits in the MXCSR. 
*/ -#define _MM_DENORMALS_ZERO_MASK 0x0040 -#define _MM_DENORMALS_ZERO_ON 0x0040 -#define _MM_DENORMALS_ZERO_OFF 0x0000 - -#define _MM_SET_DENORMALS_ZERO_MODE(mode) \ - _mm_setcsr ((_mm_getcsr () & ~_MM_DENORMALS_ZERO_MASK) | (mode)) -#define _MM_GET_DENORMALS_ZERO_MODE() \ - (_mm_getcsr() & _MM_DENORMALS_ZERO_MASK) - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_addsub_ps (__m128 __X, __m128 __Y) -{ - return (__m128) __builtin_ia32_addsubps ((__v4sf)__X, (__v4sf)__Y); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hadd_ps (__m128 __X, __m128 __Y) -{ - return (__m128) __builtin_ia32_haddps ((__v4sf)__X, (__v4sf)__Y); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hsub_ps (__m128 __X, __m128 __Y) -{ - return (__m128) __builtin_ia32_hsubps ((__v4sf)__X, (__v4sf)__Y); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_movehdup_ps (__m128 __X) -{ - return (__m128) __builtin_ia32_movshdup ((__v4sf)__X); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_moveldup_ps (__m128 __X) -{ - return (__m128) __builtin_ia32_movsldup ((__v4sf)__X); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_addsub_pd (__m128d __X, __m128d __Y) -{ - return (__m128d) __builtin_ia32_addsubpd ((__v2df)__X, (__v2df)__Y); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hadd_pd (__m128d __X, __m128d __Y) -{ - return (__m128d) __builtin_ia32_haddpd ((__v2df)__X, (__v2df)__Y); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hsub_pd (__m128d __X, __m128d __Y) -{ - return (__m128d) __builtin_ia32_hsubpd ((__v2df)__X, (__v2df)__Y); -} - -extern __inline __m128d __attribute__((__gnu_inline__, 
__always_inline__, __artificial__)) -_mm_loaddup_pd (double const *__P) -{ - return _mm_load1_pd (__P); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_movedup_pd (__m128d __X) -{ - return _mm_shuffle_pd (__X, __X, _MM_SHUFFLE2 (0,0)); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_lddqu_si128 (__m128i const *__P) -{ - return (__m128i) __builtin_ia32_lddqu ((char const *)__P); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_monitor (void const * __P, unsigned int __E, unsigned int __H) -{ - __builtin_ia32_monitor (__P, __E, __H); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mwait (unsigned int __E, unsigned int __H) -{ - __builtin_ia32_mwait (__E, __H); -} - -#endif /* __SSE3__ */ - -#endif /* _PMMINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/popcntintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/popcntintrin.h deleted file mode 100644 index af7efdf..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/popcntintrin.h +++ /dev/null @@ -1,46 +0,0 @@ -/* Copyright (C) 2009-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. 
- - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#ifndef __POPCNT__ -# error "POPCNT instruction set not enabled" -#endif /* __POPCNT__ */ - -#ifndef _POPCNTINTRIN_H_INCLUDED -#define _POPCNTINTRIN_H_INCLUDED - -/* Calculate a number of bits set to 1. */ -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_popcnt_u32 (unsigned int __X) -{ - return __builtin_popcount (__X); -} - -#ifdef __x86_64__ -extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_popcnt_u64 (unsigned long long __X) -{ - return __builtin_popcountll (__X); -} -#endif - -#endif /* _POPCNTINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/prfchwintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/prfchwintrin.h deleted file mode 100644 index b8011bb..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/prfchwintrin.h +++ /dev/null @@ -1,42 +0,0 @@ -/* Copyright (C) 2012-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. 
- - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#if !defined _X86INTRIN_H_INCLUDED && !defined _MM3DNOW_H_INCLUDED -# error "Never use <prfchwintrin.h> directly; include <x86intrin.h> or <mm3dnow.h> instead." -#endif - - -#if !defined (__PRFCHW__) && !defined (__3dNOW__) -# error "PRFCHW instruction not enabled" -#endif /* __PRFCHW__ or __3dNOW__*/ - -#ifndef _PRFCHWINTRIN_H_INCLUDED -#define _PRFCHWINTRIN_H_INCLUDED - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_prefetchw (void *__P) -{ - __builtin_prefetch (__P, 1, 3 /* _MM_HINT_T0 */); -} - -#endif /* _PRFCHWINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/rdseedintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/rdseedintrin.h deleted file mode 100644 index f30c237..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/rdseedintrin.h +++ /dev/null @@ -1,58 +0,0 @@ -/* Copyright (C) 2012-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. 
- - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#if !defined _X86INTRIN_H_INCLUDED -# error "Never use <rdseedintrin.h> directly; include <x86intrin.h> instead." -#endif - -#ifndef __RDSEED__ -# error "RDSEED instruction not enabled" -#endif /* __RDSEED__ */ - -#ifndef _RDSEEDINTRIN_H_INCLUDED -#define _RDSEEDINTRIN_H_INCLUDED - -extern __inline int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_rdseed16_step (unsigned short *p) -{ - return __builtin_ia32_rdseed_hi_step (p); -} - -extern __inline int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_rdseed32_step (unsigned int *p) -{ - return __builtin_ia32_rdseed_si_step (p); -} - -#ifdef __x86_64__ -extern __inline int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_rdseed64_step (unsigned long long *p) -{ - return __builtin_ia32_rdseed_di_step (p); -} -#endif - -#endif /* _RDSEEDINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/rtmintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/rtmintrin.h deleted file mode 100644 index 003a771..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/rtmintrin.h +++ /dev/null @@ -1,77 +0,0 @@ -/* Copyright (C) 2012-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. 
- - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#ifndef _IMMINTRIN_H_INCLUDED -# error "Never use <rtmintrin.h> directly; include <immintrin.h> instead." -#endif - -#ifndef __RTM__ -# error "RTM instruction set not enabled" -#endif /* __RTM__ */ - -#ifndef _RTMINTRIN_H_INCLUDED -#define _RTMINTRIN_H_INCLUDED - -#define _XBEGIN_STARTED (~0u) -#define _XABORT_EXPLICIT (1 << 0) -#define _XABORT_RETRY (1 << 1) -#define _XABORT_CONFLICT (1 << 2) -#define _XABORT_CAPACITY (1 << 3) -#define _XABORT_DEBUG (1 << 4) -#define _XABORT_NESTED (1 << 5) -#define _XABORT_CODE(x) (((x) >> 24) & 0xFF) - -/* Start an RTM code region. Return _XBEGIN_STARTED on success and the - abort condition otherwise. */ -extern __inline unsigned int -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_xbegin (void) -{ - return __builtin_ia32_xbegin (); -} - -/* Specify the end of an RTM code region. If it corresponds to the - outermost transaction, then attempts the transaction commit. If the - commit fails, then control is transferred to the outermost transaction - fallback handler. */ -extern __inline void -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_xend (void) -{ - __builtin_ia32_xend (); -} - -/* Force an RTM abort condition. The control is transferred to the - outermost transaction fallback handler with the abort condition IMM. 
*/ -#ifdef __OPTIMIZE__ -extern __inline void -__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_xabort (const unsigned int imm) -{ - __builtin_ia32_xabort (imm); -} -#else -#define _xabort(N) __builtin_ia32_xabort (N) -#endif /* __OPTIMIZE__ */ - -#endif /* _RTMINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/smmintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/smmintrin.h deleted file mode 100644 index 3ae916c..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/smmintrin.h +++ /dev/null @@ -1,830 +0,0 @@ -/* Copyright (C) 2007-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -/* Implemented from the specification included in the Intel C++ Compiler - User Guide and Reference, version 10.0. */ - -#ifndef _SMMINTRIN_H_INCLUDED -#define _SMMINTRIN_H_INCLUDED - -#ifndef __SSE4_1__ -# error "SSE4.1 instruction set not enabled" -#else - -/* We need definitions from the SSSE3, SSE3, SSE2 and SSE header - files. */ -#include <tmmintrin.h> - -/* Rounding mode macros. 
*/ -#define _MM_FROUND_TO_NEAREST_INT 0x00 -#define _MM_FROUND_TO_NEG_INF 0x01 -#define _MM_FROUND_TO_POS_INF 0x02 -#define _MM_FROUND_TO_ZERO 0x03 -#define _MM_FROUND_CUR_DIRECTION 0x04 - -#define _MM_FROUND_RAISE_EXC 0x00 -#define _MM_FROUND_NO_EXC 0x08 - -#define _MM_FROUND_NINT \ - (_MM_FROUND_TO_NEAREST_INT | _MM_FROUND_RAISE_EXC) -#define _MM_FROUND_FLOOR \ - (_MM_FROUND_TO_NEG_INF | _MM_FROUND_RAISE_EXC) -#define _MM_FROUND_CEIL \ - (_MM_FROUND_TO_POS_INF | _MM_FROUND_RAISE_EXC) -#define _MM_FROUND_TRUNC \ - (_MM_FROUND_TO_ZERO | _MM_FROUND_RAISE_EXC) -#define _MM_FROUND_RINT \ - (_MM_FROUND_CUR_DIRECTION | _MM_FROUND_RAISE_EXC) -#define _MM_FROUND_NEARBYINT \ - (_MM_FROUND_CUR_DIRECTION | _MM_FROUND_NO_EXC) - -/* Test Instruction */ -/* Packed integer 128-bit bitwise comparison. Return 1 if - (__V & __M) == 0. */ -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_testz_si128 (__m128i __M, __m128i __V) -{ - return __builtin_ia32_ptestz128 ((__v2di)__M, (__v2di)__V); -} - -/* Packed integer 128-bit bitwise comparison. Return 1 if - (__V & ~__M) == 0. */ -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_testc_si128 (__m128i __M, __m128i __V) -{ - return __builtin_ia32_ptestc128 ((__v2di)__M, (__v2di)__V); -} - -/* Packed integer 128-bit bitwise comparison. Return 1 if - (__V & __M) != 0 && (__V & ~__M) != 0. */ -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_testnzc_si128 (__m128i __M, __m128i __V) -{ - return __builtin_ia32_ptestnzc128 ((__v2di)__M, (__v2di)__V); -} - -/* Macros for packed integer 128-bit comparison intrinsics. */ -#define _mm_test_all_zeros(M, V) _mm_testz_si128 ((M), (V)) - -#define _mm_test_all_ones(V) \ - _mm_testc_si128 ((V), _mm_cmpeq_epi32 ((V), (V))) - -#define _mm_test_mix_ones_zeros(M, V) _mm_testnzc_si128 ((M), (V)) - -/* Packed/scalar double precision floating point rounding. 
*/ - -#ifdef __OPTIMIZE__ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_round_pd (__m128d __V, const int __M) -{ - return (__m128d) __builtin_ia32_roundpd ((__v2df)__V, __M); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_round_sd(__m128d __D, __m128d __V, const int __M) -{ - return (__m128d) __builtin_ia32_roundsd ((__v2df)__D, - (__v2df)__V, - __M); -} -#else -#define _mm_round_pd(V, M) \ - ((__m128d) __builtin_ia32_roundpd ((__v2df)(__m128d)(V), (int)(M))) - -#define _mm_round_sd(D, V, M) \ - ((__m128d) __builtin_ia32_roundsd ((__v2df)(__m128d)(D), \ - (__v2df)(__m128d)(V), (int)(M))) -#endif - -/* Packed/scalar single precision floating point rounding. */ - -#ifdef __OPTIMIZE__ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_round_ps (__m128 __V, const int __M) -{ - return (__m128) __builtin_ia32_roundps ((__v4sf)__V, __M); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_round_ss (__m128 __D, __m128 __V, const int __M) -{ - return (__m128) __builtin_ia32_roundss ((__v4sf)__D, - (__v4sf)__V, - __M); -} -#else -#define _mm_round_ps(V, M) \ - ((__m128) __builtin_ia32_roundps ((__v4sf)(__m128)(V), (int)(M))) - -#define _mm_round_ss(D, V, M) \ - ((__m128) __builtin_ia32_roundss ((__v4sf)(__m128)(D), \ - (__v4sf)(__m128)(V), (int)(M))) -#endif - -/* Macros for ceil/floor intrinsics. 
*/ -#define _mm_ceil_pd(V) _mm_round_pd ((V), _MM_FROUND_CEIL) -#define _mm_ceil_sd(D, V) _mm_round_sd ((D), (V), _MM_FROUND_CEIL) - -#define _mm_floor_pd(V) _mm_round_pd((V), _MM_FROUND_FLOOR) -#define _mm_floor_sd(D, V) _mm_round_sd ((D), (V), _MM_FROUND_FLOOR) - -#define _mm_ceil_ps(V) _mm_round_ps ((V), _MM_FROUND_CEIL) -#define _mm_ceil_ss(D, V) _mm_round_ss ((D), (V), _MM_FROUND_CEIL) - -#define _mm_floor_ps(V) _mm_round_ps ((V), _MM_FROUND_FLOOR) -#define _mm_floor_ss(D, V) _mm_round_ss ((D), (V), _MM_FROUND_FLOOR) - -/* SSE4.1 */ - -/* Integer blend instructions - select data from 2 sources using - constant/variable mask. */ - -#ifdef __OPTIMIZE__ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_blend_epi16 (__m128i __X, __m128i __Y, const int __M) -{ - return (__m128i) __builtin_ia32_pblendw128 ((__v8hi)__X, - (__v8hi)__Y, - __M); -} -#else -#define _mm_blend_epi16(X, Y, M) \ - ((__m128i) __builtin_ia32_pblendw128 ((__v8hi)(__m128i)(X), \ - (__v8hi)(__m128i)(Y), (int)(M))) -#endif - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_blendv_epi8 (__m128i __X, __m128i __Y, __m128i __M) -{ - return (__m128i) __builtin_ia32_pblendvb128 ((__v16qi)__X, - (__v16qi)__Y, - (__v16qi)__M); -} - -/* Single precision floating point blend instructions - select data - from 2 sources using constant/variable mask. 
*/ - -#ifdef __OPTIMIZE__ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_blend_ps (__m128 __X, __m128 __Y, const int __M) -{ - return (__m128) __builtin_ia32_blendps ((__v4sf)__X, - (__v4sf)__Y, - __M); -} -#else -#define _mm_blend_ps(X, Y, M) \ - ((__m128) __builtin_ia32_blendps ((__v4sf)(__m128)(X), \ - (__v4sf)(__m128)(Y), (int)(M))) -#endif - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_blendv_ps (__m128 __X, __m128 __Y, __m128 __M) -{ - return (__m128) __builtin_ia32_blendvps ((__v4sf)__X, - (__v4sf)__Y, - (__v4sf)__M); -} - -/* Double precision floating point blend instructions - select data - from 2 sources using constant/variable mask. */ - -#ifdef __OPTIMIZE__ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_blend_pd (__m128d __X, __m128d __Y, const int __M) -{ - return (__m128d) __builtin_ia32_blendpd ((__v2df)__X, - (__v2df)__Y, - __M); -} -#else -#define _mm_blend_pd(X, Y, M) \ - ((__m128d) __builtin_ia32_blendpd ((__v2df)(__m128d)(X), \ - (__v2df)(__m128d)(Y), (int)(M))) -#endif - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_blendv_pd (__m128d __X, __m128d __Y, __m128d __M) -{ - return (__m128d) __builtin_ia32_blendvpd ((__v2df)__X, - (__v2df)__Y, - (__v2df)__M); -} - -/* Dot product instructions with mask-defined summing and zeroing parts - of result. 
*/ - -#ifdef __OPTIMIZE__ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_dp_ps (__m128 __X, __m128 __Y, const int __M) -{ - return (__m128) __builtin_ia32_dpps ((__v4sf)__X, - (__v4sf)__Y, - __M); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_dp_pd (__m128d __X, __m128d __Y, const int __M) -{ - return (__m128d) __builtin_ia32_dppd ((__v2df)__X, - (__v2df)__Y, - __M); -} -#else -#define _mm_dp_ps(X, Y, M) \ - ((__m128) __builtin_ia32_dpps ((__v4sf)(__m128)(X), \ - (__v4sf)(__m128)(Y), (int)(M))) - -#define _mm_dp_pd(X, Y, M) \ - ((__m128d) __builtin_ia32_dppd ((__v2df)(__m128d)(X), \ - (__v2df)(__m128d)(Y), (int)(M))) -#endif - -/* Packed integer 64-bit comparison, zeroing or filling with ones - corresponding parts of result. */ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpeq_epi64 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_pcmpeqq ((__v2di)__X, (__v2di)__Y); -} - -/* Min/max packed integer instructions. 
*/ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_min_epi8 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_pminsb128 ((__v16qi)__X, (__v16qi)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_max_epi8 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_pmaxsb128 ((__v16qi)__X, (__v16qi)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_min_epu16 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_pminuw128 ((__v8hi)__X, (__v8hi)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_max_epu16 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_pmaxuw128 ((__v8hi)__X, (__v8hi)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_min_epi32 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_pminsd128 ((__v4si)__X, (__v4si)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_max_epi32 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_pmaxsd128 ((__v4si)__X, (__v4si)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_min_epu32 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_pminud128 ((__v4si)__X, (__v4si)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_max_epu32 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_pmaxud128 ((__v4si)__X, (__v4si)__Y); -} - -/* Packed integer 32-bit multiplication with truncation of upper - halves of results. 
*/ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mullo_epi32 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_pmulld128 ((__v4si)__X, (__v4si)__Y); -} - -/* Packed integer 32-bit multiplication of 2 pairs of operands - with two 64-bit results. */ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mul_epi32 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_pmuldq128 ((__v4si)__X, (__v4si)__Y); -} - -/* Insert single precision float into packed single precision array - element selected by index N. The bits [7-6] of N define S - index, the bits [5-4] define D index, and bits [3-0] define - zeroing mask for D. */ - -#ifdef __OPTIMIZE__ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_insert_ps (__m128 __D, __m128 __S, const int __N) -{ - return (__m128) __builtin_ia32_insertps128 ((__v4sf)__D, - (__v4sf)__S, - __N); -} -#else -#define _mm_insert_ps(D, S, N) \ - ((__m128) __builtin_ia32_insertps128 ((__v4sf)(__m128)(D), \ - (__v4sf)(__m128)(S), (int)(N))) -#endif - -/* Helper macro to create the N value for _mm_insert_ps. */ -#define _MM_MK_INSERTPS_NDX(S, D, M) (((S) << 6) | ((D) << 4) | (M)) - -/* Extract binary representation of single precision float from packed - single precision array element of X selected by index N. 
*/ - -#ifdef __OPTIMIZE__ -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_extract_ps (__m128 __X, const int __N) -{ - union { int i; float f; } __tmp; - __tmp.f = __builtin_ia32_vec_ext_v4sf ((__v4sf)__X, __N); - return __tmp.i; -} -#else -#define _mm_extract_ps(X, N) \ - (__extension__ \ - ({ \ - union { int i; float f; } __tmp; \ - __tmp.f = __builtin_ia32_vec_ext_v4sf ((__v4sf)(__m128)(X), (int)(N)); \ - __tmp.i; \ - })) -#endif - -/* Extract binary representation of single precision float into - D from packed single precision array element of S selected - by index N. */ -#define _MM_EXTRACT_FLOAT(D, S, N) \ - { (D) = __builtin_ia32_vec_ext_v4sf ((__v4sf)(S), (N)); } - -/* Extract specified single precision float element into the lower - part of __m128. */ -#define _MM_PICK_OUT_PS(X, N) \ - _mm_insert_ps (_mm_setzero_ps (), (X), \ - _MM_MK_INSERTPS_NDX ((N), 0, 0x0e)) - -/* Insert integer, S, into packed integer array element of D - selected by index N. 
*/ - -#ifdef __OPTIMIZE__ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_insert_epi8 (__m128i __D, int __S, const int __N) -{ - return (__m128i) __builtin_ia32_vec_set_v16qi ((__v16qi)__D, - __S, __N); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_insert_epi32 (__m128i __D, int __S, const int __N) -{ - return (__m128i) __builtin_ia32_vec_set_v4si ((__v4si)__D, - __S, __N); -} - -#ifdef __x86_64__ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_insert_epi64 (__m128i __D, long long __S, const int __N) -{ - return (__m128i) __builtin_ia32_vec_set_v2di ((__v2di)__D, - __S, __N); -} -#endif -#else -#define _mm_insert_epi8(D, S, N) \ - ((__m128i) __builtin_ia32_vec_set_v16qi ((__v16qi)(__m128i)(D), \ - (int)(S), (int)(N))) - -#define _mm_insert_epi32(D, S, N) \ - ((__m128i) __builtin_ia32_vec_set_v4si ((__v4si)(__m128i)(D), \ - (int)(S), (int)(N))) - -#ifdef __x86_64__ -#define _mm_insert_epi64(D, S, N) \ - ((__m128i) __builtin_ia32_vec_set_v2di ((__v2di)(__m128i)(D), \ - (long long)(S), (int)(N))) -#endif -#endif - -/* Extract integer from packed integer array element of X selected by - index N. 
*/ - -#ifdef __OPTIMIZE__ -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_extract_epi8 (__m128i __X, const int __N) -{ - return (unsigned char) __builtin_ia32_vec_ext_v16qi ((__v16qi)__X, __N); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_extract_epi32 (__m128i __X, const int __N) -{ - return __builtin_ia32_vec_ext_v4si ((__v4si)__X, __N); -} - -#ifdef __x86_64__ -extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_extract_epi64 (__m128i __X, const int __N) -{ - return __builtin_ia32_vec_ext_v2di ((__v2di)__X, __N); -} -#endif -#else -#define _mm_extract_epi8(X, N) \ - ((int) (unsigned char) __builtin_ia32_vec_ext_v16qi ((__v16qi)(__m128i)(X), (int)(N))) -#define _mm_extract_epi32(X, N) \ - ((int) __builtin_ia32_vec_ext_v4si ((__v4si)(__m128i)(X), (int)(N))) - -#ifdef __x86_64__ -#define _mm_extract_epi64(X, N) \ - ((long long) __builtin_ia32_vec_ext_v2di ((__v2di)(__m128i)(X), (int)(N))) -#endif -#endif - -/* Return horizontal packed word minimum and its index in bits [15:0] - and bits [18:16] respectively. */ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_minpos_epu16 (__m128i __X) -{ - return (__m128i) __builtin_ia32_phminposuw128 ((__v8hi)__X); -} - -/* Packed integer sign-extension. 
*/ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtepi8_epi32 (__m128i __X) -{ - return (__m128i) __builtin_ia32_pmovsxbd128 ((__v16qi)__X); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtepi16_epi32 (__m128i __X) -{ - return (__m128i) __builtin_ia32_pmovsxwd128 ((__v8hi)__X); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtepi8_epi64 (__m128i __X) -{ - return (__m128i) __builtin_ia32_pmovsxbq128 ((__v16qi)__X); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtepi32_epi64 (__m128i __X) -{ - return (__m128i) __builtin_ia32_pmovsxdq128 ((__v4si)__X); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtepi16_epi64 (__m128i __X) -{ - return (__m128i) __builtin_ia32_pmovsxwq128 ((__v8hi)__X); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtepi8_epi16 (__m128i __X) -{ - return (__m128i) __builtin_ia32_pmovsxbw128 ((__v16qi)__X); -} - -/* Packed integer zero-extension. 
*/ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtepu8_epi32 (__m128i __X) -{ - return (__m128i) __builtin_ia32_pmovzxbd128 ((__v16qi)__X); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtepu16_epi32 (__m128i __X) -{ - return (__m128i) __builtin_ia32_pmovzxwd128 ((__v8hi)__X); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtepu8_epi64 (__m128i __X) -{ - return (__m128i) __builtin_ia32_pmovzxbq128 ((__v16qi)__X); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtepu32_epi64 (__m128i __X) -{ - return (__m128i) __builtin_ia32_pmovzxdq128 ((__v4si)__X); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtepu16_epi64 (__m128i __X) -{ - return (__m128i) __builtin_ia32_pmovzxwq128 ((__v8hi)__X); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtepu8_epi16 (__m128i __X) -{ - return (__m128i) __builtin_ia32_pmovzxbw128 ((__v16qi)__X); -} - -/* Pack 8 double words from 2 operands into 8 words of result with - unsigned saturation. */ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_packus_epi32 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_packusdw128 ((__v4si)__X, (__v4si)__Y); -} - -/* Sum absolute 8-bit integer difference of adjacent groups of 4 - byte integers in the first 2 operands. Starting offsets within - operands are determined by the 3rd mask operand. 
*/ - -#ifdef __OPTIMIZE__ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mpsadbw_epu8 (__m128i __X, __m128i __Y, const int __M) -{ - return (__m128i) __builtin_ia32_mpsadbw128 ((__v16qi)__X, - (__v16qi)__Y, __M); -} -#else -#define _mm_mpsadbw_epu8(X, Y, M) \ - ((__m128i) __builtin_ia32_mpsadbw128 ((__v16qi)(__m128i)(X), \ - (__v16qi)(__m128i)(Y), (int)(M))) -#endif - -/* Load double quadword using non-temporal aligned hint. */ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_stream_load_si128 (__m128i *__X) -{ - return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X); -} - -#ifdef __SSE4_2__ - -/* These macros specify the source data format. */ -#define _SIDD_UBYTE_OPS 0x00 -#define _SIDD_UWORD_OPS 0x01 -#define _SIDD_SBYTE_OPS 0x02 -#define _SIDD_SWORD_OPS 0x03 - -/* These macros specify the comparison operation. */ -#define _SIDD_CMP_EQUAL_ANY 0x00 -#define _SIDD_CMP_RANGES 0x04 -#define _SIDD_CMP_EQUAL_EACH 0x08 -#define _SIDD_CMP_EQUAL_ORDERED 0x0c - -/* These macros specify the polarity. */ -#define _SIDD_POSITIVE_POLARITY 0x00 -#define _SIDD_NEGATIVE_POLARITY 0x10 -#define _SIDD_MASKED_POSITIVE_POLARITY 0x20 -#define _SIDD_MASKED_NEGATIVE_POLARITY 0x30 - -/* These macros specify the output selection in _mm_cmpXstri (). */ -#define _SIDD_LEAST_SIGNIFICANT 0x00 -#define _SIDD_MOST_SIGNIFICANT 0x40 - -/* These macros specify the output selection in _mm_cmpXstrm (). */ -#define _SIDD_BIT_MASK 0x00 -#define _SIDD_UNIT_MASK 0x40 - -/* Intrinsics for text/string processing. 
*/ - -#ifdef __OPTIMIZE__ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpistrm (__m128i __X, __m128i __Y, const int __M) -{ - return (__m128i) __builtin_ia32_pcmpistrm128 ((__v16qi)__X, - (__v16qi)__Y, - __M); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpistri (__m128i __X, __m128i __Y, const int __M) -{ - return __builtin_ia32_pcmpistri128 ((__v16qi)__X, - (__v16qi)__Y, - __M); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpestrm (__m128i __X, int __LX, __m128i __Y, int __LY, const int __M) -{ - return (__m128i) __builtin_ia32_pcmpestrm128 ((__v16qi)__X, __LX, - (__v16qi)__Y, __LY, - __M); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpestri (__m128i __X, int __LX, __m128i __Y, int __LY, const int __M) -{ - return __builtin_ia32_pcmpestri128 ((__v16qi)__X, __LX, - (__v16qi)__Y, __LY, - __M); -} -#else -#define _mm_cmpistrm(X, Y, M) \ - ((__m128i) __builtin_ia32_pcmpistrm128 ((__v16qi)(__m128i)(X), \ - (__v16qi)(__m128i)(Y), (int)(M))) -#define _mm_cmpistri(X, Y, M) \ - ((int) __builtin_ia32_pcmpistri128 ((__v16qi)(__m128i)(X), \ - (__v16qi)(__m128i)(Y), (int)(M))) - -#define _mm_cmpestrm(X, LX, Y, LY, M) \ - ((__m128i) __builtin_ia32_pcmpestrm128 ((__v16qi)(__m128i)(X), \ - (int)(LX), (__v16qi)(__m128i)(Y), \ - (int)(LY), (int)(M))) -#define _mm_cmpestri(X, LX, Y, LY, M) \ - ((int) __builtin_ia32_pcmpestri128 ((__v16qi)(__m128i)(X), (int)(LX), \ - (__v16qi)(__m128i)(Y), (int)(LY), \ - (int)(M))) -#endif - -/* Intrinsics for text/string processing and reading values of - EFlags. 
*/ - -#ifdef __OPTIMIZE__ -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpistra (__m128i __X, __m128i __Y, const int __M) -{ - return __builtin_ia32_pcmpistria128 ((__v16qi)__X, - (__v16qi)__Y, - __M); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpistrc (__m128i __X, __m128i __Y, const int __M) -{ - return __builtin_ia32_pcmpistric128 ((__v16qi)__X, - (__v16qi)__Y, - __M); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpistro (__m128i __X, __m128i __Y, const int __M) -{ - return __builtin_ia32_pcmpistrio128 ((__v16qi)__X, - (__v16qi)__Y, - __M); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpistrs (__m128i __X, __m128i __Y, const int __M) -{ - return __builtin_ia32_pcmpistris128 ((__v16qi)__X, - (__v16qi)__Y, - __M); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpistrz (__m128i __X, __m128i __Y, const int __M) -{ - return __builtin_ia32_pcmpistriz128 ((__v16qi)__X, - (__v16qi)__Y, - __M); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpestra (__m128i __X, int __LX, __m128i __Y, int __LY, const int __M) -{ - return __builtin_ia32_pcmpestria128 ((__v16qi)__X, __LX, - (__v16qi)__Y, __LY, - __M); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpestrc (__m128i __X, int __LX, __m128i __Y, int __LY, const int __M) -{ - return __builtin_ia32_pcmpestric128 ((__v16qi)__X, __LX, - (__v16qi)__Y, __LY, - __M); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpestro (__m128i __X, int __LX, __m128i __Y, int __LY, const int __M) -{ - return __builtin_ia32_pcmpestrio128 ((__v16qi)__X, __LX, - (__v16qi)__Y, __LY, - __M); -} - -extern __inline int 
__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpestrs (__m128i __X, int __LX, __m128i __Y, int __LY, const int __M) -{ - return __builtin_ia32_pcmpestris128 ((__v16qi)__X, __LX, - (__v16qi)__Y, __LY, - __M); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpestrz (__m128i __X, int __LX, __m128i __Y, int __LY, const int __M) -{ - return __builtin_ia32_pcmpestriz128 ((__v16qi)__X, __LX, - (__v16qi)__Y, __LY, - __M); -} -#else -#define _mm_cmpistra(X, Y, M) \ - ((int) __builtin_ia32_pcmpistria128 ((__v16qi)(__m128i)(X), \ - (__v16qi)(__m128i)(Y), (int)(M))) -#define _mm_cmpistrc(X, Y, M) \ - ((int) __builtin_ia32_pcmpistric128 ((__v16qi)(__m128i)(X), \ - (__v16qi)(__m128i)(Y), (int)(M))) -#define _mm_cmpistro(X, Y, M) \ - ((int) __builtin_ia32_pcmpistrio128 ((__v16qi)(__m128i)(X), \ - (__v16qi)(__m128i)(Y), (int)(M))) -#define _mm_cmpistrs(X, Y, M) \ - ((int) __builtin_ia32_pcmpistris128 ((__v16qi)(__m128i)(X), \ - (__v16qi)(__m128i)(Y), (int)(M))) -#define _mm_cmpistrz(X, Y, M) \ - ((int) __builtin_ia32_pcmpistriz128 ((__v16qi)(__m128i)(X), \ - (__v16qi)(__m128i)(Y), (int)(M))) - -#define _mm_cmpestra(X, LX, Y, LY, M) \ - ((int) __builtin_ia32_pcmpestria128 ((__v16qi)(__m128i)(X), (int)(LX), \ - (__v16qi)(__m128i)(Y), (int)(LY), \ - (int)(M))) -#define _mm_cmpestrc(X, LX, Y, LY, M) \ - ((int) __builtin_ia32_pcmpestric128 ((__v16qi)(__m128i)(X), (int)(LX), \ - (__v16qi)(__m128i)(Y), (int)(LY), \ - (int)(M))) -#define _mm_cmpestro(X, LX, Y, LY, M) \ - ((int) __builtin_ia32_pcmpestrio128 ((__v16qi)(__m128i)(X), (int)(LX), \ - (__v16qi)(__m128i)(Y), (int)(LY), \ - (int)(M))) -#define _mm_cmpestrs(X, LX, Y, LY, M) \ - ((int) __builtin_ia32_pcmpestris128 ((__v16qi)(__m128i)(X), (int)(LX), \ - (__v16qi)(__m128i)(Y), (int)(LY), \ - (int)(M))) -#define _mm_cmpestrz(X, LX, Y, LY, M) \ - ((int) __builtin_ia32_pcmpestriz128 ((__v16qi)(__m128i)(X), (int)(LX), \ - (__v16qi)(__m128i)(Y), (int)(LY), \ - 
(int)(M))) -#endif - -/* Packed integer 64-bit comparison, zeroing or filling with ones - corresponding parts of result. */ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpgt_epi64 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_pcmpgtq ((__v2di)__X, (__v2di)__Y); -} - -#ifdef __POPCNT__ -#include <popcntintrin.h> -#endif - -/* Accumulate CRC32 (polynomial 0x11EDC6F41) value. */ -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_crc32_u8 (unsigned int __C, unsigned char __V) -{ - return __builtin_ia32_crc32qi (__C, __V); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_crc32_u16 (unsigned int __C, unsigned short __V) -{ - return __builtin_ia32_crc32hi (__C, __V); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_crc32_u32 (unsigned int __C, unsigned int __V) -{ - return __builtin_ia32_crc32si (__C, __V); -} - -#ifdef __x86_64__ -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_crc32_u64 (unsigned long long __C, unsigned long long __V) -{ - return __builtin_ia32_crc32di (__C, __V); -} -#endif - -#endif /* __SSE4_2__ */ - -#endif /* __SSE4_1__ */ - -#endif /* _SMMINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/stdalign.h b/lib/gcc/x86_64-linux-android/4.8/include/stdalign.h deleted file mode 100644 index fe545dd..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/stdalign.h +++ /dev/null @@ -1,39 +0,0 @@ -/* Copyright (C) 2011-2013 Free Software Foundation, Inc. - -This file is part of GCC. - -GCC is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; either version 3, or (at your option) -any later version. 
- -GCC is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. - -Under Section 7 of GPL version 3, you are granted additional -permissions described in the GCC Runtime Library Exception, version -3.1, as published by the Free Software Foundation. - -You should have received a copy of the GNU General Public License and -a copy of the GCC Runtime Library Exception along with this program; -see the files COPYING3 and COPYING.RUNTIME respectively. If not, see -<http://www.gnu.org/licenses/>. */ - -/* ISO C1X: 7.15 Alignment <stdalign.h>. */ - -#ifndef _STDALIGN_H -#define _STDALIGN_H - -#ifndef __cplusplus - -#define alignas _Alignas -#define alignof _Alignof - -#define __alignas_is_defined 1 -#define __alignof_is_defined 1 - -#endif - -#endif /* stdalign.h */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/stdarg.h b/lib/gcc/x86_64-linux-android/4.8/include/stdarg.h deleted file mode 100644 index fb4e0d6..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/stdarg.h +++ /dev/null @@ -1,126 +0,0 @@ -/* Copyright (C) 1989-2013 Free Software Foundation, Inc. - -This file is part of GCC. - -GCC is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; either version 3, or (at your option) -any later version. - -GCC is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. - -Under Section 7 of GPL version 3, you are granted additional -permissions described in the GCC Runtime Library Exception, version -3.1, as published by the Free Software Foundation. 
- -You should have received a copy of the GNU General Public License and -a copy of the GCC Runtime Library Exception along with this program; -see the files COPYING3 and COPYING.RUNTIME respectively. If not, see -<http://www.gnu.org/licenses/>. */ - -/* - * ISO C Standard: 7.15 Variable arguments <stdarg.h> - */ - -#ifndef _STDARG_H -#ifndef _ANSI_STDARG_H_ -#ifndef __need___va_list -#define _STDARG_H -#define _ANSI_STDARG_H_ -#endif /* not __need___va_list */ -#undef __need___va_list - -/* Define __gnuc_va_list. */ - -#ifndef __GNUC_VA_LIST -#define __GNUC_VA_LIST -typedef __builtin_va_list __gnuc_va_list; -#endif - -/* Define the standard macros for the user, - if this invocation was from the user program. */ -#ifdef _STDARG_H - -#define va_start(v,l) __builtin_va_start(v,l) -#define va_end(v) __builtin_va_end(v) -#define va_arg(v,l) __builtin_va_arg(v,l) -#if !defined(__STRICT_ANSI__) || __STDC_VERSION__ + 0 >= 199900L || defined(__GXX_EXPERIMENTAL_CXX0X__) -#define va_copy(d,s) __builtin_va_copy(d,s) -#endif -#define __va_copy(d,s) __builtin_va_copy(d,s) - -/* Define va_list, if desired, from __gnuc_va_list. */ -/* We deliberately do not define va_list when called from - stdio.h, because ANSI C says that stdio.h is not supposed to define - va_list. stdio.h needs to have access to that data type, - but must not use that name. It should use the name __gnuc_va_list, - which is safe because it is reserved for the implementation. */ - -#ifdef _BSD_VA_LIST -#undef _BSD_VA_LIST -#endif - -#if defined(__svr4__) || (defined(_SCO_DS) && !defined(__VA_LIST)) -/* SVR4.2 uses _VA_LIST for an internal alias for va_list, - so we must avoid testing it and setting it here. - SVR4 uses _VA_LIST as a flag in stdarg.h, but we should - have no conflict with that. 
*/ -#ifndef _VA_LIST_ -#define _VA_LIST_ -#ifdef __i860__ -#ifndef _VA_LIST -#define _VA_LIST va_list -#endif -#endif /* __i860__ */ -typedef __gnuc_va_list va_list; -#ifdef _SCO_DS -#define __VA_LIST -#endif -#endif /* _VA_LIST_ */ -#else /* not __svr4__ || _SCO_DS */ - -/* The macro _VA_LIST_ is the same thing used by this file in Ultrix. - But on BSD NET2 we must not test or define or undef it. - (Note that the comments in NET 2's ansi.h - are incorrect for _VA_LIST_--see stdio.h!) */ -#if !defined (_VA_LIST_) || defined (__BSD_NET2__) || defined (____386BSD____) || defined (__bsdi__) || defined (__sequent__) || defined (__FreeBSD__) || defined(WINNT) -/* The macro _VA_LIST_DEFINED is used in Windows NT 3.5 */ -#ifndef _VA_LIST_DEFINED -/* The macro _VA_LIST is used in SCO Unix 3.2. */ -#ifndef _VA_LIST -/* The macro _VA_LIST_T_H is used in the Bull dpx2 */ -#ifndef _VA_LIST_T_H -/* The macro __va_list__ is used by BeOS. */ -#ifndef __va_list__ -typedef __gnuc_va_list va_list; -#endif /* not __va_list__ */ -#endif /* not _VA_LIST_T_H */ -#endif /* not _VA_LIST */ -#endif /* not _VA_LIST_DEFINED */ -#if !(defined (__BSD_NET2__) || defined (____386BSD____) || defined (__bsdi__) || defined (__sequent__) || defined (__FreeBSD__)) -#define _VA_LIST_ -#endif -#ifndef _VA_LIST -#define _VA_LIST -#endif -#ifndef _VA_LIST_DEFINED -#define _VA_LIST_DEFINED -#endif -#ifndef _VA_LIST_T_H -#define _VA_LIST_T_H -#endif -#ifndef __va_list__ -#define __va_list__ -#endif - -#endif /* not _VA_LIST_, except on certain systems */ - -#endif /* not __svr4__ */ - -#endif /* _STDARG_H */ - -#endif /* not _ANSI_STDARG_H_ */ -#endif /* not _STDARG_H */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/stdbool.h b/lib/gcc/x86_64-linux-android/4.8/include/stdbool.h deleted file mode 100644 index 7146e63..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/stdbool.h +++ /dev/null @@ -1,50 +0,0 @@ -/* Copyright (C) 1998-2013 Free Software Foundation, Inc. - -This file is part of GCC. 
- -GCC is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; either version 3, or (at your option) -any later version. - -GCC is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. - -Under Section 7 of GPL version 3, you are granted additional -permissions described in the GCC Runtime Library Exception, version -3.1, as published by the Free Software Foundation. - -You should have received a copy of the GNU General Public License and -a copy of the GCC Runtime Library Exception along with this program; -see the files COPYING3 and COPYING.RUNTIME respectively. If not, see -<http://www.gnu.org/licenses/>. */ - -/* - * ISO C Standard: 7.16 Boolean type and values <stdbool.h> - */ - -#ifndef _STDBOOL_H -#define _STDBOOL_H - -#ifndef __cplusplus - -#define bool _Bool -#define true 1 -#define false 0 - -#else /* __cplusplus */ - -/* Supporting <stdbool.h> in C++ is a GCC extension. */ -#define _Bool bool -#define bool bool -#define false false -#define true true - -#endif /* __cplusplus */ - -/* Signal that all the definitions are present. */ -#define __bool_true_false_are_defined 1 - -#endif /* stdbool.h */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/stddef.h b/lib/gcc/x86_64-linux-android/4.8/include/stddef.h deleted file mode 100644 index b04dd65..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/stddef.h +++ /dev/null @@ -1,439 +0,0 @@ -/* Copyright (C) 1989-2013 Free Software Foundation, Inc. - -This file is part of GCC. - -GCC is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; either version 3, or (at your option) -any later version. 
- -GCC is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. - -Under Section 7 of GPL version 3, you are granted additional -permissions described in the GCC Runtime Library Exception, version -3.1, as published by the Free Software Foundation. - -You should have received a copy of the GNU General Public License and -a copy of the GCC Runtime Library Exception along with this program; -see the files COPYING3 and COPYING.RUNTIME respectively. If not, see -<http://www.gnu.org/licenses/>. */ - -/* - * ISO C Standard: 7.17 Common definitions <stddef.h> - */ -#if (!defined(_STDDEF_H) && !defined(_STDDEF_H_) && !defined(_ANSI_STDDEF_H) \ - && !defined(__STDDEF_H__)) \ - || defined(__need_wchar_t) || defined(__need_size_t) \ - || defined(__need_ptrdiff_t) || defined(__need_NULL) \ - || defined(__need_wint_t) - -/* Any one of these symbols __need_* means that GNU libc - wants us just to define one data type. So don't define - the symbols that indicate this file's entire job has been done. */ -#if (!defined(__need_wchar_t) && !defined(__need_size_t) \ - && !defined(__need_ptrdiff_t) && !defined(__need_NULL) \ - && !defined(__need_wint_t)) -#define _STDDEF_H -#define _STDDEF_H_ -/* snaroff@next.com says the NeXT needs this. */ -#define _ANSI_STDDEF_H -#endif - -#ifndef __sys_stdtypes_h -/* This avoids lossage on SunOS but only if stdtypes.h comes first. - There's no way to win with the other order! Sun lossage. */ - -/* On 4.3bsd-net2, make sure ansi.h is included, so we have - one less case to deal with in the following. */ -#if defined (__BSD_NET2__) || defined (____386BSD____) || (defined (__FreeBSD__) && (__FreeBSD__ < 5)) || defined(__NetBSD__) -#include <machine/ansi.h> -#endif -/* On FreeBSD 5, machine/ansi.h does not exist anymore... 
*/ -#if defined (__FreeBSD__) && (__FreeBSD__ >= 5) -#include <sys/_types.h> -#endif - -/* In 4.3bsd-net2, machine/ansi.h defines these symbols, which are - defined if the corresponding type is *not* defined. - FreeBSD-2.1 defines _MACHINE_ANSI_H_ instead of _ANSI_H_. - NetBSD defines _I386_ANSI_H_ and _X86_64_ANSI_H_ instead of _ANSI_H_ */ -#if defined(_ANSI_H_) || defined(_MACHINE_ANSI_H_) || defined(_X86_64_ANSI_H_) || defined(_I386_ANSI_H_) -#if !defined(_SIZE_T_) && !defined(_BSD_SIZE_T_) -#define _SIZE_T -#endif -#if !defined(_PTRDIFF_T_) && !defined(_BSD_PTRDIFF_T_) -#define _PTRDIFF_T -#endif -/* On BSD/386 1.1, at least, machine/ansi.h defines _BSD_WCHAR_T_ - instead of _WCHAR_T_. */ -#if !defined(_WCHAR_T_) && !defined(_BSD_WCHAR_T_) -#ifndef _BSD_WCHAR_T_ -#define _WCHAR_T -#endif -#endif -/* Undef _FOO_T_ if we are supposed to define foo_t. */ -#if defined (__need_ptrdiff_t) || defined (_STDDEF_H_) -#undef _PTRDIFF_T_ -#undef _BSD_PTRDIFF_T_ -#endif -#if defined (__need_size_t) || defined (_STDDEF_H_) -#undef _SIZE_T_ -#undef _BSD_SIZE_T_ -#endif -#if defined (__need_wchar_t) || defined (_STDDEF_H_) -#undef _WCHAR_T_ -#undef _BSD_WCHAR_T_ -#endif -#endif /* defined(_ANSI_H_) || defined(_MACHINE_ANSI_H_) || defined(_X86_64_ANSI_H_) || defined(_I386_ANSI_H_) */ - -/* Sequent's header files use _PTRDIFF_T_ in some conflicting way. - Just ignore it. */ -#if defined (__sequent__) && defined (_PTRDIFF_T_) -#undef _PTRDIFF_T_ -#endif - -/* On VxWorks, <type/vxTypesBase.h> may have defined macros like - _TYPE_size_t which will typedef size_t. fixincludes patched the - vxTypesBase.h so that this macro is only defined if _GCC_SIZE_T is - not defined, and so that defining this macro defines _GCC_SIZE_T. - If we find that the macros are still defined at this point, we must - invoke them so that the type is defined as expected. 
*/ -#if defined (_TYPE_ptrdiff_t) && (defined (__need_ptrdiff_t) || defined (_STDDEF_H_)) -_TYPE_ptrdiff_t; -#undef _TYPE_ptrdiff_t -#endif -#if defined (_TYPE_size_t) && (defined (__need_size_t) || defined (_STDDEF_H_)) -_TYPE_size_t; -#undef _TYPE_size_t -#endif -#if defined (_TYPE_wchar_t) && (defined (__need_wchar_t) || defined (_STDDEF_H_)) -_TYPE_wchar_t; -#undef _TYPE_wchar_t -#endif - -/* In case nobody has defined these types, but we aren't running under - GCC 2.00, make sure that __PTRDIFF_TYPE__, __SIZE_TYPE__, and - __WCHAR_TYPE__ have reasonable values. This can happen if the - parts of GCC is compiled by an older compiler, that actually - include gstddef.h, such as collect2. */ - -/* Signed type of difference of two pointers. */ - -/* Define this type if we are doing the whole job, - or if we want this type in particular. */ -#if defined (_STDDEF_H) || defined (__need_ptrdiff_t) -#ifndef _PTRDIFF_T /* in case <sys/types.h> has defined it. */ -#ifndef _T_PTRDIFF_ -#ifndef _T_PTRDIFF -#ifndef __PTRDIFF_T -#ifndef _PTRDIFF_T_ -#ifndef _BSD_PTRDIFF_T_ -#ifndef ___int_ptrdiff_t_h -#ifndef _GCC_PTRDIFF_T -#define _PTRDIFF_T -#define _T_PTRDIFF_ -#define _T_PTRDIFF -#define __PTRDIFF_T -#define _PTRDIFF_T_ -#define _BSD_PTRDIFF_T_ -#define ___int_ptrdiff_t_h -#define _GCC_PTRDIFF_T -#ifndef __PTRDIFF_TYPE__ -#define __PTRDIFF_TYPE__ long int -#endif -typedef __PTRDIFF_TYPE__ ptrdiff_t; -#endif /* _GCC_PTRDIFF_T */ -#endif /* ___int_ptrdiff_t_h */ -#endif /* _BSD_PTRDIFF_T_ */ -#endif /* _PTRDIFF_T_ */ -#endif /* __PTRDIFF_T */ -#endif /* _T_PTRDIFF */ -#endif /* _T_PTRDIFF_ */ -#endif /* _PTRDIFF_T */ - -/* If this symbol has done its job, get rid of it. */ -#undef __need_ptrdiff_t - -#endif /* _STDDEF_H or __need_ptrdiff_t. */ - -/* Unsigned type of `sizeof' something. */ - -/* Define this type if we are doing the whole job, - or if we want this type in particular. 
*/ -#if defined (_STDDEF_H) || defined (__need_size_t) -#ifndef __size_t__ /* BeOS */ -#ifndef __SIZE_T__ /* Cray Unicos/Mk */ -#ifndef _SIZE_T /* in case <sys/types.h> has defined it. */ -#ifndef _SYS_SIZE_T_H -#ifndef _T_SIZE_ -#ifndef _T_SIZE -#ifndef __SIZE_T -#ifndef _SIZE_T_ -#ifndef _BSD_SIZE_T_ -#ifndef _SIZE_T_DEFINED_ -#ifndef _SIZE_T_DEFINED -#ifndef _BSD_SIZE_T_DEFINED_ /* Darwin */ -#ifndef _SIZE_T_DECLARED /* FreeBSD 5 */ -#ifndef ___int_size_t_h -#ifndef _GCC_SIZE_T -#ifndef _SIZET_ -#ifndef __size_t -#define __size_t__ /* BeOS */ -#define __SIZE_T__ /* Cray Unicos/Mk */ -#define _SIZE_T -#define _SYS_SIZE_T_H -#define _T_SIZE_ -#define _T_SIZE -#define __SIZE_T -#define _SIZE_T_ -#define _BSD_SIZE_T_ -#define _SIZE_T_DEFINED_ -#define _SIZE_T_DEFINED -#define _BSD_SIZE_T_DEFINED_ /* Darwin */ -#define _SIZE_T_DECLARED /* FreeBSD 5 */ -#define ___int_size_t_h -#define _GCC_SIZE_T -#define _SIZET_ -#if (defined (__FreeBSD__) && (__FreeBSD__ >= 5)) \ - || defined(__FreeBSD_kernel__) -/* __size_t is a typedef on FreeBSD 5, must not trash it. */ -#elif defined (__VMS__) -/* __size_t is also a typedef on VMS. */ -#else -#define __size_t -#endif -#ifndef __SIZE_TYPE__ -#define __SIZE_TYPE__ long unsigned int -#endif -#if !(defined (__GNUG__) && defined (size_t)) -typedef __SIZE_TYPE__ size_t; -#ifdef __BEOS__ -typedef long ssize_t; -#endif /* __BEOS__ */ -#endif /* !(defined (__GNUG__) && defined (size_t)) */ -#endif /* __size_t */ -#endif /* _SIZET_ */ -#endif /* _GCC_SIZE_T */ -#endif /* ___int_size_t_h */ -#endif /* _SIZE_T_DECLARED */ -#endif /* _BSD_SIZE_T_DEFINED_ */ -#endif /* _SIZE_T_DEFINED */ -#endif /* _SIZE_T_DEFINED_ */ -#endif /* _BSD_SIZE_T_ */ -#endif /* _SIZE_T_ */ -#endif /* __SIZE_T */ -#endif /* _T_SIZE */ -#endif /* _T_SIZE_ */ -#endif /* _SYS_SIZE_T_H */ -#endif /* _SIZE_T */ -#endif /* __SIZE_T__ */ -#endif /* __size_t__ */ -#undef __need_size_t -#endif /* _STDDEF_H or __need_size_t. */ - - -/* Wide character type. 
- Locale-writers should change this as necessary to - be big enough to hold unique values not between 0 and 127, - and not (wchar_t) -1, for each defined multibyte character. */ - -/* Define this type if we are doing the whole job, - or if we want this type in particular. */ -#if defined (_STDDEF_H) || defined (__need_wchar_t) -#ifndef __wchar_t__ /* BeOS */ -#ifndef __WCHAR_T__ /* Cray Unicos/Mk */ -#ifndef _WCHAR_T -#ifndef _T_WCHAR_ -#ifndef _T_WCHAR -#ifndef __WCHAR_T -#ifndef _WCHAR_T_ -#ifndef _BSD_WCHAR_T_ -#ifndef _BSD_WCHAR_T_DEFINED_ /* Darwin */ -#ifndef _BSD_RUNE_T_DEFINED_ /* Darwin */ -#ifndef _WCHAR_T_DECLARED /* FreeBSD 5 */ -#ifndef _WCHAR_T_DEFINED_ -#ifndef _WCHAR_T_DEFINED -#ifndef _WCHAR_T_H -#ifndef ___int_wchar_t_h -#ifndef __INT_WCHAR_T_H -#ifndef _GCC_WCHAR_T -#define __wchar_t__ /* BeOS */ -#define __WCHAR_T__ /* Cray Unicos/Mk */ -#define _WCHAR_T -#define _T_WCHAR_ -#define _T_WCHAR -#define __WCHAR_T -#define _WCHAR_T_ -#define _BSD_WCHAR_T_ -#define _WCHAR_T_DEFINED_ -#define _WCHAR_T_DEFINED -#define _WCHAR_T_H -#define ___int_wchar_t_h -#define __INT_WCHAR_T_H -#define _GCC_WCHAR_T -#define _WCHAR_T_DECLARED - -/* On BSD/386 1.1, at least, machine/ansi.h defines _BSD_WCHAR_T_ - instead of _WCHAR_T_, and _BSD_RUNE_T_ (which, unlike the other - symbols in the _FOO_T_ family, stays defined even after its - corresponding type is defined). If we define wchar_t, then we - must undef _WCHAR_T_; for BSD/386 1.1 (and perhaps others), if - we undef _WCHAR_T_, then we must also define rune_t, since - headers like runetype.h assume that if machine/ansi.h is included, - and _BSD_WCHAR_T_ is not defined, then rune_t is available. - machine/ansi.h says, "Note that _WCHAR_T_ and _RUNE_T_ must be of - the same type." 
*/ -#ifdef _BSD_WCHAR_T_ -#undef _BSD_WCHAR_T_ -#ifdef _BSD_RUNE_T_ -#if !defined (_ANSI_SOURCE) && !defined (_POSIX_SOURCE) -typedef _BSD_RUNE_T_ rune_t; -#define _BSD_WCHAR_T_DEFINED_ -#define _BSD_RUNE_T_DEFINED_ /* Darwin */ -#if defined (__FreeBSD__) && (__FreeBSD__ < 5) -/* Why is this file so hard to maintain properly? In contrast to - the comment above regarding BSD/386 1.1, on FreeBSD for as long - as the symbol has existed, _BSD_RUNE_T_ must not stay defined or - redundant typedefs will occur when stdlib.h is included after this file. */ -#undef _BSD_RUNE_T_ -#endif -#endif -#endif -#endif -/* FreeBSD 5 can't be handled well using "traditional" logic above - since it no longer defines _BSD_RUNE_T_ yet still desires to export - rune_t in some cases... */ -#if defined (__FreeBSD__) && (__FreeBSD__ >= 5) -#if !defined (_ANSI_SOURCE) && !defined (_POSIX_SOURCE) -#if __BSD_VISIBLE -#ifndef _RUNE_T_DECLARED -typedef __rune_t rune_t; -#define _RUNE_T_DECLARED -#endif -#endif -#endif -#endif - -#ifndef __WCHAR_TYPE__ -#define __WCHAR_TYPE__ int -#endif -#ifndef __cplusplus -typedef __WCHAR_TYPE__ wchar_t; -#endif -#endif -#endif -#endif -#endif -#endif -#endif -#endif /* _WCHAR_T_DECLARED */ -#endif /* _BSD_RUNE_T_DEFINED_ */ -#endif -#endif -#endif -#endif -#endif -#endif -#endif -#endif /* __WCHAR_T__ */ -#endif /* __wchar_t__ */ -#undef __need_wchar_t -#endif /* _STDDEF_H or __need_wchar_t. */ - -#if defined (__need_wint_t) -#ifndef _WINT_T -#define _WINT_T - -#ifndef __WINT_TYPE__ -#define __WINT_TYPE__ unsigned int -#endif -typedef __WINT_TYPE__ wint_t; -#endif -#undef __need_wint_t -#endif - -/* In 4.3bsd-net2, leave these undefined to indicate that size_t, etc. - are already defined. */ -/* BSD/OS 3.1 and FreeBSD [23].x require the MACHINE_ANSI_H check here. */ -/* NetBSD 5 requires the I386_ANSI_H and X86_64_ANSI_H checks here. 
*/ -#if defined(_ANSI_H_) || defined(_MACHINE_ANSI_H_) || defined(_X86_64_ANSI_H_) || defined(_I386_ANSI_H_) -/* The references to _GCC_PTRDIFF_T_, _GCC_SIZE_T_, and _GCC_WCHAR_T_ - are probably typos and should be removed before 2.8 is released. */ -#ifdef _GCC_PTRDIFF_T_ -#undef _PTRDIFF_T_ -#undef _BSD_PTRDIFF_T_ -#endif -#ifdef _GCC_SIZE_T_ -#undef _SIZE_T_ -#undef _BSD_SIZE_T_ -#endif -#ifdef _GCC_WCHAR_T_ -#undef _WCHAR_T_ -#undef _BSD_WCHAR_T_ -#endif -/* The following ones are the real ones. */ -#ifdef _GCC_PTRDIFF_T -#undef _PTRDIFF_T_ -#undef _BSD_PTRDIFF_T_ -#endif -#ifdef _GCC_SIZE_T -#undef _SIZE_T_ -#undef _BSD_SIZE_T_ -#endif -#ifdef _GCC_WCHAR_T -#undef _WCHAR_T_ -#undef _BSD_WCHAR_T_ -#endif -#endif /* _ANSI_H_ || _MACHINE_ANSI_H_ || _X86_64_ANSI_H_ || _I386_ANSI_H_ */ - -#endif /* __sys_stdtypes_h */ - -/* A null pointer constant. */ - -#if defined (_STDDEF_H) || defined (__need_NULL) -#undef NULL /* in case <stdio.h> has defined it. */ -#ifdef __GNUG__ -#define NULL __null -#else /* G++ */ -#ifndef __cplusplus -#define NULL ((void *)0) -#else /* C++ */ -#define NULL 0 -#endif /* C++ */ -#endif /* G++ */ -#endif /* NULL not defined and <stddef.h> or need NULL. */ -#undef __need_NULL - -#ifdef _STDDEF_H - -/* Offset of member MEMBER in a struct of type TYPE. */ -#define offsetof(TYPE, MEMBER) __builtin_offsetof (TYPE, MEMBER) - -#if (defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L) \ - || (defined(__cplusplus) && __cplusplus >= 201103L) -#ifndef _GCC_MAX_ALIGN_T -#define _GCC_MAX_ALIGN_T -/* Type whose alignment is supported in every context and is at least - as great as that of any standard type not using alignment - specifiers. */ -typedef struct { - long long __max_align_ll __attribute__((__aligned__(__alignof__(long long)))); - long double __max_align_ld __attribute__((__aligned__(__alignof__(long double)))); -} max_align_t; -#endif -#endif /* C11 or C++11. 
*/ - -#if defined(__cplusplus) && __cplusplus >= 201103L -#ifndef _GXX_NULLPTR_T -#define _GXX_NULLPTR_T - typedef decltype(nullptr) nullptr_t; -#endif -#endif /* C++11. */ - -#endif /* _STDDEF_H was defined this time */ - -#endif /* !_STDDEF_H && !_STDDEF_H_ && !_ANSI_STDDEF_H && !__STDDEF_H__ - || __need_XXX was not defined before */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/stdfix.h b/lib/gcc/x86_64-linux-android/4.8/include/stdfix.h deleted file mode 100644 index fdcef1e..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/stdfix.h +++ /dev/null @@ -1,204 +0,0 @@ -/* Copyright (C) 2007-2013 Free Software Foundation, Inc. - -This file is part of GCC. - -GCC is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; either version 3, or (at your option) -any later version. - -GCC is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. - -Under Section 7 of GPL version 3, you are granted additional -permissions described in the GCC Runtime Library Exception, version -3.1, as published by the Free Software Foundation. - -You should have received a copy of the GNU General Public License and -a copy of the GCC Runtime Library Exception along with this program; -see the files COPYING3 and COPYING.RUNTIME respectively. If not, see -<http://www.gnu.org/licenses/>. */ - -/* ISO/IEC JTC1 SC22 WG14 N1169 - * Date: 2006-04-04 - * ISO/IEC TR 18037 - * Programming languages - C - Extensions to support embedded processors - */ - -#ifndef _STDFIX_H -#define _STDFIX_H - -/* 7.18a.1 Introduction. */ - -#undef fract -#undef accum -#undef sat -#define fract _Fract -#define accum _Accum -#define sat _Sat - -/* 7.18a.3 Precision macros. 
*/ - -#undef SFRACT_FBIT -#undef SFRACT_MIN -#undef SFRACT_MAX -#undef SFRACT_EPSILON -#define SFRACT_FBIT __SFRACT_FBIT__ -#define SFRACT_MIN __SFRACT_MIN__ -#define SFRACT_MAX __SFRACT_MAX__ -#define SFRACT_EPSILON __SFRACT_EPSILON__ - -#undef USFRACT_FBIT -#undef USFRACT_MIN -#undef USFRACT_MAX -#undef USFRACT_EPSILON -#define USFRACT_FBIT __USFRACT_FBIT__ -#define USFRACT_MIN __USFRACT_MIN__ /* GCC extension. */ -#define USFRACT_MAX __USFRACT_MAX__ -#define USFRACT_EPSILON __USFRACT_EPSILON__ - -#undef FRACT_FBIT -#undef FRACT_MIN -#undef FRACT_MAX -#undef FRACT_EPSILON -#define FRACT_FBIT __FRACT_FBIT__ -#define FRACT_MIN __FRACT_MIN__ -#define FRACT_MAX __FRACT_MAX__ -#define FRACT_EPSILON __FRACT_EPSILON__ - -#undef UFRACT_FBIT -#undef UFRACT_MIN -#undef UFRACT_MAX -#undef UFRACT_EPSILON -#define UFRACT_FBIT __UFRACT_FBIT__ -#define UFRACT_MIN __UFRACT_MIN__ /* GCC extension. */ -#define UFRACT_MAX __UFRACT_MAX__ -#define UFRACT_EPSILON __UFRACT_EPSILON__ - -#undef LFRACT_FBIT -#undef LFRACT_MIN -#undef LFRACT_MAX -#undef LFRACT_EPSILON -#define LFRACT_FBIT __LFRACT_FBIT__ -#define LFRACT_MIN __LFRACT_MIN__ -#define LFRACT_MAX __LFRACT_MAX__ -#define LFRACT_EPSILON __LFRACT_EPSILON__ - -#undef ULFRACT_FBIT -#undef ULFRACT_MIN -#undef ULFRACT_MAX -#undef ULFRACT_EPSILON -#define ULFRACT_FBIT __ULFRACT_FBIT__ -#define ULFRACT_MIN __ULFRACT_MIN__ /* GCC extension. */ -#define ULFRACT_MAX __ULFRACT_MAX__ -#define ULFRACT_EPSILON __ULFRACT_EPSILON__ - -#undef LLFRACT_FBIT -#undef LLFRACT_MIN -#undef LLFRACT_MAX -#undef LLFRACT_EPSILON -#define LLFRACT_FBIT __LLFRACT_FBIT__ /* GCC extension. */ -#define LLFRACT_MIN __LLFRACT_MIN__ /* GCC extension. */ -#define LLFRACT_MAX __LLFRACT_MAX__ /* GCC extension. */ -#define LLFRACT_EPSILON __LLFRACT_EPSILON__ /* GCC extension. */ - -#undef ULLFRACT_FBIT -#undef ULLFRACT_MIN -#undef ULLFRACT_MAX -#undef ULLFRACT_EPSILON -#define ULLFRACT_FBIT __ULLFRACT_FBIT__ /* GCC extension. 
*/ -#define ULLFRACT_MIN __ULLFRACT_MIN__ /* GCC extension. */ -#define ULLFRACT_MAX __ULLFRACT_MAX__ /* GCC extension. */ -#define ULLFRACT_EPSILON __ULLFRACT_EPSILON__ /* GCC extension. */ - -#undef SACCUM_FBIT -#undef SACCUM_IBIT -#undef SACCUM_MIN -#undef SACCUM_MAX -#undef SACCUM_EPSILON -#define SACCUM_FBIT __SACCUM_FBIT__ -#define SACCUM_IBIT __SACCUM_IBIT__ -#define SACCUM_MIN __SACCUM_MIN__ -#define SACCUM_MAX __SACCUM_MAX__ -#define SACCUM_EPSILON __SACCUM_EPSILON__ - -#undef USACCUM_FBIT -#undef USACCUM_IBIT -#undef USACCUM_MIN -#undef USACCUM_MAX -#undef USACCUM_EPSILON -#define USACCUM_FBIT __USACCUM_FBIT__ -#define USACCUM_IBIT __USACCUM_IBIT__ -#define USACCUM_MIN __USACCUM_MIN__ /* GCC extension. */ -#define USACCUM_MAX __USACCUM_MAX__ -#define USACCUM_EPSILON __USACCUM_EPSILON__ - -#undef ACCUM_FBIT -#undef ACCUM_IBIT -#undef ACCUM_MIN -#undef ACCUM_MAX -#undef ACCUM_EPSILON -#define ACCUM_FBIT __ACCUM_FBIT__ -#define ACCUM_IBIT __ACCUM_IBIT__ -#define ACCUM_MIN __ACCUM_MIN__ -#define ACCUM_MAX __ACCUM_MAX__ -#define ACCUM_EPSILON __ACCUM_EPSILON__ - -#undef UACCUM_FBIT -#undef UACCUM_IBIT -#undef UACCUM_MIN -#undef UACCUM_MAX -#undef UACCUM_EPSILON -#define UACCUM_FBIT __UACCUM_FBIT__ -#define UACCUM_IBIT __UACCUM_IBIT__ -#define UACCUM_MIN __UACCUM_MIN__ /* GCC extension. */ -#define UACCUM_MAX __UACCUM_MAX__ -#define UACCUM_EPSILON __UACCUM_EPSILON__ - -#undef LACCUM_FBIT -#undef LACCUM_IBIT -#undef LACCUM_MIN -#undef LACCUM_MAX -#undef LACCUM_EPSILON -#define LACCUM_FBIT __LACCUM_FBIT__ -#define LACCUM_IBIT __LACCUM_IBIT__ -#define LACCUM_MIN __LACCUM_MIN__ -#define LACCUM_MAX __LACCUM_MAX__ -#define LACCUM_EPSILON __LACCUM_EPSILON__ - -#undef ULACCUM_FBIT -#undef ULACCUM_IBIT -#undef ULACCUM_MIN -#undef ULACCUM_MAX -#undef ULACCUM_EPSILON -#define ULACCUM_FBIT __ULACCUM_FBIT__ -#define ULACCUM_IBIT __ULACCUM_IBIT__ -#define ULACCUM_MIN __ULACCUM_MIN__ /* GCC extension. 
*/ -#define ULACCUM_MAX __ULACCUM_MAX__ -#define ULACCUM_EPSILON __ULACCUM_EPSILON__ - -#undef LLACCUM_FBIT -#undef LLACCUM_IBIT -#undef LLACCUM_MIN -#undef LLACCUM_MAX -#undef LLACCUM_EPSILON -#define LLACCUM_FBIT __LLACCUM_FBIT__ /* GCC extension. */ -#define LLACCUM_IBIT __LLACCUM_IBIT__ /* GCC extension. */ -#define LLACCUM_MIN __LLACCUM_MIN__ /* GCC extension. */ -#define LLACCUM_MAX __LLACCUM_MAX__ /* GCC extension. */ -#define LLACCUM_EPSILON __LLACCUM_EPSILON__ /* GCC extension. */ - -#undef ULLACCUM_FBIT -#undef ULLACCUM_IBIT -#undef ULLACCUM_MIN -#undef ULLACCUM_MAX -#undef ULLACCUM_EPSILON -#define ULLACCUM_FBIT __ULLACCUM_FBIT__ /* GCC extension. */ -#define ULLACCUM_IBIT __ULLACCUM_IBIT__ /* GCC extension. */ -#define ULLACCUM_MIN __ULLACCUM_MIN__ /* GCC extension. */ -#define ULLACCUM_MAX __ULLACCUM_MAX__ /* GCC extension. */ -#define ULLACCUM_EPSILON __ULLACCUM_EPSILON__ /* GCC extension. */ - -#endif /* _STDFIX_H */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/stdint-gcc.h b/lib/gcc/x86_64-linux-android/4.8/include/stdint-gcc.h deleted file mode 100644 index 97339e2..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/stdint-gcc.h +++ /dev/null @@ -1,263 +0,0 @@ -/* Copyright (C) 2008-2013 Free Software Foundation, Inc. - -This file is part of GCC. - -GCC is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; either version 3, or (at your option) -any later version. - -GCC is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. - -Under Section 7 of GPL version 3, you are granted additional -permissions described in the GCC Runtime Library Exception, version -3.1, as published by the Free Software Foundation. 
- -You should have received a copy of the GNU General Public License and -a copy of the GCC Runtime Library Exception along with this program; -see the files COPYING3 and COPYING.RUNTIME respectively. If not, see -<http://www.gnu.org/licenses/>. */ - -/* - * ISO C Standard: 7.18 Integer types <stdint.h> - */ - -#ifndef _GCC_STDINT_H -#define _GCC_STDINT_H - -/* 7.8.1.1 Exact-width integer types */ - -#ifdef __INT8_TYPE__ -typedef __INT8_TYPE__ int8_t; -#endif -#ifdef __INT16_TYPE__ -typedef __INT16_TYPE__ int16_t; -#endif -#ifdef __INT32_TYPE__ -typedef __INT32_TYPE__ int32_t; -#endif -#ifdef __INT64_TYPE__ -typedef __INT64_TYPE__ int64_t; -#endif -#ifdef __UINT8_TYPE__ -typedef __UINT8_TYPE__ uint8_t; -#endif -#ifdef __UINT16_TYPE__ -typedef __UINT16_TYPE__ uint16_t; -#endif -#ifdef __UINT32_TYPE__ -typedef __UINT32_TYPE__ uint32_t; -#endif -#ifdef __UINT64_TYPE__ -typedef __UINT64_TYPE__ uint64_t; -#endif - -/* 7.8.1.2 Minimum-width integer types */ - -typedef __INT_LEAST8_TYPE__ int_least8_t; -typedef __INT_LEAST16_TYPE__ int_least16_t; -typedef __INT_LEAST32_TYPE__ int_least32_t; -typedef __INT_LEAST64_TYPE__ int_least64_t; -typedef __UINT_LEAST8_TYPE__ uint_least8_t; -typedef __UINT_LEAST16_TYPE__ uint_least16_t; -typedef __UINT_LEAST32_TYPE__ uint_least32_t; -typedef __UINT_LEAST64_TYPE__ uint_least64_t; - -/* 7.8.1.3 Fastest minimum-width integer types */ - -typedef __INT_FAST8_TYPE__ int_fast8_t; -typedef __INT_FAST16_TYPE__ int_fast16_t; -typedef __INT_FAST32_TYPE__ int_fast32_t; -typedef __INT_FAST64_TYPE__ int_fast64_t; -typedef __UINT_FAST8_TYPE__ uint_fast8_t; -typedef __UINT_FAST16_TYPE__ uint_fast16_t; -typedef __UINT_FAST32_TYPE__ uint_fast32_t; -typedef __UINT_FAST64_TYPE__ uint_fast64_t; - -/* 7.8.1.4 Integer types capable of holding object pointers */ - -#ifdef __INTPTR_TYPE__ -typedef __INTPTR_TYPE__ intptr_t; -#endif -#ifdef __UINTPTR_TYPE__ -typedef __UINTPTR_TYPE__ uintptr_t; -#endif - -/* 7.8.1.5 Greatest-width integer types */ - -typedef 
__INTMAX_TYPE__ intmax_t; -typedef __UINTMAX_TYPE__ uintmax_t; - -#if (!defined __cplusplus || __cplusplus >= 201103L \ - || defined __STDC_LIMIT_MACROS) - -/* 7.18.2 Limits of specified-width integer types */ - -#ifdef __INT8_MAX__ -# undef INT8_MAX -# define INT8_MAX __INT8_MAX__ -# undef INT8_MIN -# define INT8_MIN (-INT8_MAX - 1) -#endif -#ifdef __UINT8_MAX__ -# undef UINT8_MAX -# define UINT8_MAX __UINT8_MAX__ -#endif -#ifdef __INT16_MAX__ -# undef INT16_MAX -# define INT16_MAX __INT16_MAX__ -# undef INT16_MIN -# define INT16_MIN (-INT16_MAX - 1) -#endif -#ifdef __UINT16_MAX__ -# undef UINT16_MAX -# define UINT16_MAX __UINT16_MAX__ -#endif -#ifdef __INT32_MAX__ -# undef INT32_MAX -# define INT32_MAX __INT32_MAX__ -# undef INT32_MIN -# define INT32_MIN (-INT32_MAX - 1) -#endif -#ifdef __UINT32_MAX__ -# undef UINT32_MAX -# define UINT32_MAX __UINT32_MAX__ -#endif -#ifdef __INT64_MAX__ -# undef INT64_MAX -# define INT64_MAX __INT64_MAX__ -# undef INT64_MIN -# define INT64_MIN (-INT64_MAX - 1) -#endif -#ifdef __UINT64_MAX__ -# undef UINT64_MAX -# define UINT64_MAX __UINT64_MAX__ -#endif - -#undef INT_LEAST8_MAX -#define INT_LEAST8_MAX __INT_LEAST8_MAX__ -#undef INT_LEAST8_MIN -#define INT_LEAST8_MIN (-INT_LEAST8_MAX - 1) -#undef UINT_LEAST8_MAX -#define UINT_LEAST8_MAX __UINT_LEAST8_MAX__ -#undef INT_LEAST16_MAX -#define INT_LEAST16_MAX __INT_LEAST16_MAX__ -#undef INT_LEAST16_MIN -#define INT_LEAST16_MIN (-INT_LEAST16_MAX - 1) -#undef UINT_LEAST16_MAX -#define UINT_LEAST16_MAX __UINT_LEAST16_MAX__ -#undef INT_LEAST32_MAX -#define INT_LEAST32_MAX __INT_LEAST32_MAX__ -#undef INT_LEAST32_MIN -#define INT_LEAST32_MIN (-INT_LEAST32_MAX - 1) -#undef UINT_LEAST32_MAX -#define UINT_LEAST32_MAX __UINT_LEAST32_MAX__ -#undef INT_LEAST64_MAX -#define INT_LEAST64_MAX __INT_LEAST64_MAX__ -#undef INT_LEAST64_MIN -#define INT_LEAST64_MIN (-INT_LEAST64_MAX - 1) -#undef UINT_LEAST64_MAX -#define UINT_LEAST64_MAX __UINT_LEAST64_MAX__ - -#undef INT_FAST8_MAX -#define INT_FAST8_MAX 
__INT_FAST8_MAX__ -#undef INT_FAST8_MIN -#define INT_FAST8_MIN (-INT_FAST8_MAX - 1) -#undef UINT_FAST8_MAX -#define UINT_FAST8_MAX __UINT_FAST8_MAX__ -#undef INT_FAST16_MAX -#define INT_FAST16_MAX __INT_FAST16_MAX__ -#undef INT_FAST16_MIN -#define INT_FAST16_MIN (-INT_FAST16_MAX - 1) -#undef UINT_FAST16_MAX -#define UINT_FAST16_MAX __UINT_FAST16_MAX__ -#undef INT_FAST32_MAX -#define INT_FAST32_MAX __INT_FAST32_MAX__ -#undef INT_FAST32_MIN -#define INT_FAST32_MIN (-INT_FAST32_MAX - 1) -#undef UINT_FAST32_MAX -#define UINT_FAST32_MAX __UINT_FAST32_MAX__ -#undef INT_FAST64_MAX -#define INT_FAST64_MAX __INT_FAST64_MAX__ -#undef INT_FAST64_MIN -#define INT_FAST64_MIN (-INT_FAST64_MAX - 1) -#undef UINT_FAST64_MAX -#define UINT_FAST64_MAX __UINT_FAST64_MAX__ - -#ifdef __INTPTR_MAX__ -# undef INTPTR_MAX -# define INTPTR_MAX __INTPTR_MAX__ -# undef INTPTR_MIN -# define INTPTR_MIN (-INTPTR_MAX - 1) -#endif -#ifdef __UINTPTR_MAX__ -# undef UINTPTR_MAX -# define UINTPTR_MAX __UINTPTR_MAX__ -#endif - -#undef INTMAX_MAX -#define INTMAX_MAX __INTMAX_MAX__ -#undef INTMAX_MIN -#define INTMAX_MIN (-INTMAX_MAX - 1) -#undef UINTMAX_MAX -#define UINTMAX_MAX __UINTMAX_MAX__ - -/* 7.18.3 Limits of other integer types */ - -#undef PTRDIFF_MAX -#define PTRDIFF_MAX __PTRDIFF_MAX__ -#undef PTRDIFF_MIN -#define PTRDIFF_MIN (-PTRDIFF_MAX - 1) - -#undef SIG_ATOMIC_MAX -#define SIG_ATOMIC_MAX __SIG_ATOMIC_MAX__ -#undef SIG_ATOMIC_MIN -#define SIG_ATOMIC_MIN __SIG_ATOMIC_MIN__ - -#undef SIZE_MAX -#define SIZE_MAX __SIZE_MAX__ - -#undef WCHAR_MAX -#define WCHAR_MAX __WCHAR_MAX__ -#undef WCHAR_MIN -#define WCHAR_MIN __WCHAR_MIN__ - -#undef WINT_MAX -#define WINT_MAX __WINT_MAX__ -#undef WINT_MIN -#define WINT_MIN __WINT_MIN__ - -#endif /* (!defined __cplusplus || __cplusplus >= 201103L - || defined __STDC_LIMIT_MACROS) */ - -#if (!defined __cplusplus || __cplusplus >= 201103L \ - || defined __STDC_CONSTANT_MACROS) - -#undef INT8_C -#define INT8_C(c) __INT8_C(c) -#undef INT16_C -#define INT16_C(c) 
__INT16_C(c) -#undef INT32_C -#define INT32_C(c) __INT32_C(c) -#undef INT64_C -#define INT64_C(c) __INT64_C(c) -#undef UINT8_C -#define UINT8_C(c) __UINT8_C(c) -#undef UINT16_C -#define UINT16_C(c) __UINT16_C(c) -#undef UINT32_C -#define UINT32_C(c) __UINT32_C(c) -#undef UINT64_C -#define UINT64_C(c) __UINT64_C(c) -#undef INTMAX_C -#define INTMAX_C(c) __INTMAX_C(c) -#undef UINTMAX_C -#define UINTMAX_C(c) __UINTMAX_C(c) - -#endif /* (!defined __cplusplus || __cplusplus >= 201103L - || defined __STDC_CONSTANT_MACROS) */ - -#endif /* _GCC_STDINT_H */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/stdint.h b/lib/gcc/x86_64-linux-android/4.8/include/stdint.h deleted file mode 100644 index 83b6f70..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/stdint.h +++ /dev/null @@ -1,14 +0,0 @@ -#ifndef _GCC_WRAP_STDINT_H -#if __STDC_HOSTED__ -# if defined __cplusplus && __cplusplus >= 201103L -# undef __STDC_LIMIT_MACROS -# define __STDC_LIMIT_MACROS -# undef __STDC_CONSTANT_MACROS -# define __STDC_CONSTANT_MACROS -# endif -# include_next <stdint.h> -#else -# include "stdint-gcc.h" -#endif -#define _GCC_WRAP_STDINT_H -#endif diff --git a/lib/gcc/x86_64-linux-android/4.8/include/stdnoreturn.h b/lib/gcc/x86_64-linux-android/4.8/include/stdnoreturn.h deleted file mode 100644 index ce4bec9..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/stdnoreturn.h +++ /dev/null @@ -1,35 +0,0 @@ -/* Copyright (C) 2011-2013 Free Software Foundation, Inc. - -This file is part of GCC. - -GCC is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; either version 3, or (at your option) -any later version. - -GCC is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. 
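The deleted stdint-gcc.h spells every signed minimum as `(-MAX - 1)` rather than as a literal. A short C sketch of why; the `MY_*` names are hypothetical local stand-ins for the header's macros:

```c
#include <assert.h>
#include <stdint.h>

/* Why stdint-gcc.h defines INT32_MIN as (-INT32_MAX - 1): the literal
   -2147483648 is parsed as unary minus applied to 2147483648, a value
   too large for a 32-bit int, so the header derives each minimum from
   the corresponding maximum instead.  MY_* are hypothetical stand-ins. */
#define MY_INT32_MAX 2147483647
#define MY_INT32_MIN (-MY_INT32_MAX - 1)

static int min_matches_stdint(void)
{
    /* The derived value is the real two's-complement minimum. */
    return MY_INT32_MIN == INT32_MIN;
}
```

The same `(-MAX - 1)` construction appears for every signed type in the header, from `INT8_MIN` up through `INTMAX_MIN`.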
- -Under Section 7 of GPL version 3, you are granted additional -permissions described in the GCC Runtime Library Exception, version -3.1, as published by the Free Software Foundation. - -You should have received a copy of the GNU General Public License and -a copy of the GCC Runtime Library Exception along with this program; -see the files COPYING3 and COPYING.RUNTIME respectively. If not, see -<http://www.gnu.org/licenses/>. */ - -/* ISO C1X: 7.23 _Noreturn <stdnoreturn.h>. */ - -#ifndef _STDNORETURN_H -#define _STDNORETURN_H - -#ifndef __cplusplus - -#define noreturn _Noreturn - -#endif - -#endif /* stdnoreturn.h */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/tbmintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/tbmintrin.h deleted file mode 100644 index 07c4f77..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/tbmintrin.h +++ /dev/null @@ -1,172 +0,0 @@ -/* Copyright (C) 2010-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#ifndef _X86INTRIN_H_INCLUDED -# error "Never use <tbmintrin.h> directly; include <x86intrin.h> instead." 
-#endif - -#ifndef __TBM__ -# error "TBM instruction set not enabled" -#endif /* __TBM__ */ - -#ifndef _TBMINTRIN_H_INCLUDED -#define _TBMINTRIN_H_INCLUDED - -#ifdef __OPTIMIZE__ -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__bextri_u32 (unsigned int __X, const unsigned int __I) -{ - return __builtin_ia32_bextri_u32 (__X, __I); -} -#else -#define __bextri_u32(X, I) \ - ((unsigned int)__builtin_ia32_bextri_u32 ((unsigned int)(X), \ - (unsigned int)(I))) -#endif /*__OPTIMIZE__ */ - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blcfill_u32 (unsigned int __X) -{ - return __X & (__X + 1); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blci_u32 (unsigned int __X) -{ - return __X | ~(__X + 1); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blcic_u32 (unsigned int __X) -{ - return ~__X & (__X + 1); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blcmsk_u32 (unsigned int __X) -{ - return __X ^ (__X + 1); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blcs_u32 (unsigned int __X) -{ - return __X | (__X + 1); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blsfill_u32 (unsigned int __X) -{ - return __X | (__X - 1); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blsic_u32 (unsigned int __X) -{ - return ~__X | (__X - 1); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__t1mskc_u32 (unsigned int __X) -{ - return ~__X | (__X + 1); -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__tzmsk_u32 (unsigned int __X) -{ - return ~__X & 
(__X - 1); -} - - - -#ifdef __x86_64__ -#ifdef __OPTIMIZE__ -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__bextri_u64 (unsigned long long __X, const unsigned int __I) -{ - return __builtin_ia32_bextri_u64 (__X, __I); -} -#else -#define __bextri_u64(X, I) \ - ((unsigned long long)__builtin_ia32_bextri_u64 ((unsigned long long)(X), \ - (unsigned long long)(I))) -#endif /*__OPTIMIZE__ */ - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blcfill_u64 (unsigned long long __X) -{ - return __X & (__X + 1); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blci_u64 (unsigned long long __X) -{ - return __X | ~(__X + 1); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blcic_u64 (unsigned long long __X) -{ - return ~__X & (__X + 1); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blcmsk_u64 (unsigned long long __X) -{ - return __X ^ (__X + 1); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blcs_u64 (unsigned long long __X) -{ - return __X | (__X + 1); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blsfill_u64 (unsigned long long __X) -{ - return __X | (__X - 1); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__blsic_u64 (unsigned long long __X) -{ - return ~__X | (__X - 1); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__t1mskc_u64 (unsigned long long __X) -{ - return ~__X | (__X + 1); -} - -extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -__tzmsk_u64 (unsigned long long __X) 
-{ - return ~__X & (__X - 1); -} - - -#endif /* __x86_64__ */ -#endif /* _TBMINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/tmmintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/tmmintrin.h deleted file mode 100644 index 767b199..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/tmmintrin.h +++ /dev/null @@ -1,244 +0,0 @@ -/* Copyright (C) 2006-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -/* Implemented from the specification included in the Intel C++ Compiler - User Guide and Reference, version 9.1. 
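The non-builtin TBM helpers deleted above are plain bit identities — their bodies are ordinary C expressions, so they need no TBM instructions at all. A portable restatement of four of them:

```c
#include <assert.h>

/* Portable restatements of four bit identities from the deleted
   tbmintrin.h fallback bodies; no TBM hardware or builtins involved. */
static unsigned blcfill_u32(unsigned x) { return x & (x + 1); }  /* clear the trailing run of 1 bits  */
static unsigned blsfill_u32(unsigned x) { return x | (x - 1); }  /* set the trailing run of 0 bits    */
static unsigned tzmsk_u32(unsigned x)   { return ~x & (x - 1); } /* mask covering the trailing 0 bits */
static unsigned blcmsk_u32(unsigned x)  { return x ^ (x + 1); }  /* mask through the lowest 0 bit     */
```

For example, `blcfill_u32(0xB)` is `0x8`: adding 1 carries through the trailing ones of `0b1011`, so the AND clears them.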
*/ - -#ifndef _TMMINTRIN_H_INCLUDED -#define _TMMINTRIN_H_INCLUDED - -#ifndef __SSSE3__ -# error "SSSE3 instruction set not enabled" -#else - -/* We need definitions from the SSE3, SSE2 and SSE header files*/ -#include <pmmintrin.h> - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hadd_epi16 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_phaddw128 ((__v8hi)__X, (__v8hi)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hadd_epi32 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_phaddd128 ((__v4si)__X, (__v4si)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hadds_epi16 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_phaddsw128 ((__v8hi)__X, (__v8hi)__Y); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hadd_pi16 (__m64 __X, __m64 __Y) -{ - return (__m64) __builtin_ia32_phaddw ((__v4hi)__X, (__v4hi)__Y); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hadd_pi32 (__m64 __X, __m64 __Y) -{ - return (__m64) __builtin_ia32_phaddd ((__v2si)__X, (__v2si)__Y); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hadds_pi16 (__m64 __X, __m64 __Y) -{ - return (__m64) __builtin_ia32_phaddsw ((__v4hi)__X, (__v4hi)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hsub_epi16 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_phsubw128 ((__v8hi)__X, (__v8hi)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hsub_epi32 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_phsubd128 ((__v4si)__X, (__v4si)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
__artificial__)) -_mm_hsubs_epi16 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_phsubsw128 ((__v8hi)__X, (__v8hi)__Y); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hsub_pi16 (__m64 __X, __m64 __Y) -{ - return (__m64) __builtin_ia32_phsubw ((__v4hi)__X, (__v4hi)__Y); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hsub_pi32 (__m64 __X, __m64 __Y) -{ - return (__m64) __builtin_ia32_phsubd ((__v2si)__X, (__v2si)__Y); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hsubs_pi16 (__m64 __X, __m64 __Y) -{ - return (__m64) __builtin_ia32_phsubsw ((__v4hi)__X, (__v4hi)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maddubs_epi16 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_pmaddubsw128 ((__v16qi)__X, (__v16qi)__Y); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maddubs_pi16 (__m64 __X, __m64 __Y) -{ - return (__m64) __builtin_ia32_pmaddubsw ((__v8qi)__X, (__v8qi)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mulhrs_epi16 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_pmulhrsw128 ((__v8hi)__X, (__v8hi)__Y); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mulhrs_pi16 (__m64 __X, __m64 __Y) -{ - return (__m64) __builtin_ia32_pmulhrsw ((__v4hi)__X, (__v4hi)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_shuffle_epi8 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_pshufb128 ((__v16qi)__X, (__v16qi)__Y); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_shuffle_pi8 (__m64 __X, __m64 __Y) -{ - return (__m64) 
__builtin_ia32_pshufb ((__v8qi)__X, (__v8qi)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sign_epi8 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_psignb128 ((__v16qi)__X, (__v16qi)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sign_epi16 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_psignw128 ((__v8hi)__X, (__v8hi)__Y); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sign_epi32 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_psignd128 ((__v4si)__X, (__v4si)__Y); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sign_pi8 (__m64 __X, __m64 __Y) -{ - return (__m64) __builtin_ia32_psignb ((__v8qi)__X, (__v8qi)__Y); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sign_pi16 (__m64 __X, __m64 __Y) -{ - return (__m64) __builtin_ia32_psignw ((__v4hi)__X, (__v4hi)__Y); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sign_pi32 (__m64 __X, __m64 __Y) -{ - return (__m64) __builtin_ia32_psignd ((__v2si)__X, (__v2si)__Y); -} - -#ifdef __OPTIMIZE__ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_alignr_epi8(__m128i __X, __m128i __Y, const int __N) -{ - return (__m128i) __builtin_ia32_palignr128 ((__v2di)__X, - (__v2di)__Y, __N * 8); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_alignr_pi8(__m64 __X, __m64 __Y, const int __N) -{ - return (__m64) __builtin_ia32_palignr ((__v1di)__X, - (__v1di)__Y, __N * 8); -} -#else -#define _mm_alignr_epi8(X, Y, N) \ - ((__m128i) __builtin_ia32_palignr128 ((__v2di)(__m128i)(X), \ - (__v2di)(__m128i)(Y), \ - (int)(N) * 8)) -#define _mm_alignr_pi8(X, Y, N) \ - ((__m64) 
__builtin_ia32_palignr ((__v1di)(__m64)(X), \ - (__v1di)(__m64)(Y), \ - (int)(N) * 8)) -#endif - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_abs_epi8 (__m128i __X) -{ - return (__m128i) __builtin_ia32_pabsb128 ((__v16qi)__X); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_abs_epi16 (__m128i __X) -{ - return (__m128i) __builtin_ia32_pabsw128 ((__v8hi)__X); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_abs_epi32 (__m128i __X) -{ - return (__m128i) __builtin_ia32_pabsd128 ((__v4si)__X); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_abs_pi8 (__m64 __X) -{ - return (__m64) __builtin_ia32_pabsb ((__v8qi)__X); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_abs_pi16 (__m64 __X) -{ - return (__m64) __builtin_ia32_pabsw ((__v4hi)__X); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_abs_pi32 (__m64 __X) -{ - return (__m64) __builtin_ia32_pabsd ((__v2si)__X); -} - -#endif /* __SSSE3__ */ - -#endif /* _TMMINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/unwind.h b/lib/gcc/x86_64-linux-android/4.8/include/unwind.h deleted file mode 100644 index b8d78b9..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/unwind.h +++ /dev/null @@ -1,293 +0,0 @@ -/* Exception handling and frame unwind runtime interface routines. - Copyright (C) 2001-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify it - under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. 
- - GCC is distributed in the hope that it will be useful, but WITHOUT - ANY WARRANTY; without even the implied warranty of MERCHANTABILITY - or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public - License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -/* This is derived from the C++ ABI for IA-64. Where we diverge - for cross-architecture compatibility are noted with "@@@". */ - -#ifndef _UNWIND_H -#define _UNWIND_H - -#if defined (__SEH__) && !defined (__USING_SJLJ_EXCEPTIONS__) -/* Only for _GCC_specific_handler. */ -#include <windows.h> -#endif - -#ifndef HIDE_EXPORTS -#pragma GCC visibility push(default) -#endif - -#ifdef __cplusplus -extern "C" { -#endif - -/* Level 1: Base ABI */ - -/* @@@ The IA-64 ABI uses uint64 throughout. Most places this is - inefficient for 32-bit and smaller machines. */ -typedef unsigned _Unwind_Word __attribute__((__mode__(__unwind_word__))); -typedef signed _Unwind_Sword __attribute__((__mode__(__unwind_word__))); -#if defined(__ia64__) && defined(__hpux__) -typedef unsigned _Unwind_Ptr __attribute__((__mode__(__word__))); -#else -typedef unsigned _Unwind_Ptr __attribute__((__mode__(__pointer__))); -#endif -typedef unsigned _Unwind_Internal_Ptr __attribute__((__mode__(__pointer__))); - -/* @@@ The IA-64 ABI uses a 64-bit word to identify the producer and - consumer of an exception. We'll go along with this for now even on - 32-bit machines. We'll need to provide some other option for - 16-bit machines and for machines with > 8 bits per byte. 
*/ -typedef unsigned _Unwind_Exception_Class __attribute__((__mode__(__DI__))); - -/* The unwind interface uses reason codes in several contexts to - identify the reasons for failures or other actions. */ -typedef enum -{ - _URC_NO_REASON = 0, - _URC_FOREIGN_EXCEPTION_CAUGHT = 1, - _URC_FATAL_PHASE2_ERROR = 2, - _URC_FATAL_PHASE1_ERROR = 3, - _URC_NORMAL_STOP = 4, - _URC_END_OF_STACK = 5, - _URC_HANDLER_FOUND = 6, - _URC_INSTALL_CONTEXT = 7, - _URC_CONTINUE_UNWIND = 8 -} _Unwind_Reason_Code; - - -/* The unwind interface uses a pointer to an exception header object - as its representation of an exception being thrown. In general, the - full representation of an exception object is language- and - implementation-specific, but it will be prefixed by a header - understood by the unwind interface. */ - -struct _Unwind_Exception; - -typedef void (*_Unwind_Exception_Cleanup_Fn) (_Unwind_Reason_Code, - struct _Unwind_Exception *); - -struct _Unwind_Exception -{ - _Unwind_Exception_Class exception_class; - _Unwind_Exception_Cleanup_Fn exception_cleanup; - -#if !defined (__USING_SJLJ_EXCEPTIONS__) && defined (__SEH__) - _Unwind_Word private_[6]; -#else - _Unwind_Word private_1; - _Unwind_Word private_2; -#endif - - /* @@@ The IA-64 ABI says that this structure must be double-word aligned. - Taking that literally does not make much sense generically. Instead we - provide the maximum alignment required by any type for the machine. */ -} __attribute__((__aligned__)); - - -/* The ACTIONS argument to the personality routine is a bitwise OR of one - or more of the following constants. */ -typedef int _Unwind_Action; - -#define _UA_SEARCH_PHASE 1 -#define _UA_CLEANUP_PHASE 2 -#define _UA_HANDLER_FRAME 4 -#define _UA_FORCE_UNWIND 8 -#define _UA_END_OF_STACK 16 - -/* The target can override this macro to define any back-end-specific - attributes required for the lowest-level stack frame. 
*/ -#ifndef LIBGCC2_UNWIND_ATTRIBUTE -#define LIBGCC2_UNWIND_ATTRIBUTE -#endif - -/* This is an opaque type used to refer to a system-specific data - structure used by the system unwinder. This context is created and - destroyed by the system, and passed to the personality routine - during unwinding. */ -struct _Unwind_Context; - -/* Raise an exception, passing along the given exception object. */ -extern _Unwind_Reason_Code LIBGCC2_UNWIND_ATTRIBUTE -_Unwind_RaiseException (struct _Unwind_Exception *); - -/* Raise an exception for forced unwinding. */ - -typedef _Unwind_Reason_Code (*_Unwind_Stop_Fn) - (int, _Unwind_Action, _Unwind_Exception_Class, - struct _Unwind_Exception *, struct _Unwind_Context *, void *); - -extern _Unwind_Reason_Code LIBGCC2_UNWIND_ATTRIBUTE -_Unwind_ForcedUnwind (struct _Unwind_Exception *, _Unwind_Stop_Fn, void *); - -/* Helper to invoke the exception_cleanup routine. */ -extern void _Unwind_DeleteException (struct _Unwind_Exception *); - -/* Resume propagation of an existing exception. This is used after - e.g. executing cleanup code, and not to implement rethrowing. */ -extern void LIBGCC2_UNWIND_ATTRIBUTE -_Unwind_Resume (struct _Unwind_Exception *); - -/* @@@ Resume propagation of a FORCE_UNWIND exception, or to rethrow - a normal exception that was handled. */ -extern _Unwind_Reason_Code LIBGCC2_UNWIND_ATTRIBUTE -_Unwind_Resume_or_Rethrow (struct _Unwind_Exception *); - -/* @@@ Use unwind data to perform a stack backtrace. The trace callback - is called for every stack frame in the call chain, but no cleanup - actions are performed. */ -typedef _Unwind_Reason_Code (*_Unwind_Trace_Fn) - (struct _Unwind_Context *, void *); - -extern _Unwind_Reason_Code LIBGCC2_UNWIND_ATTRIBUTE -_Unwind_Backtrace (_Unwind_Trace_Fn, void *); - -/* These functions are used for communicating information about the unwind - context (i.e. 
the unwind descriptors and the user register state) between - the unwind library and the personality routine and landing pad. Only - selected registers may be manipulated. */ - -extern _Unwind_Word _Unwind_GetGR (struct _Unwind_Context *, int); -extern void _Unwind_SetGR (struct _Unwind_Context *, int, _Unwind_Word); - -extern _Unwind_Ptr _Unwind_GetIP (struct _Unwind_Context *); -extern _Unwind_Ptr _Unwind_GetIPInfo (struct _Unwind_Context *, int *); -extern void _Unwind_SetIP (struct _Unwind_Context *, _Unwind_Ptr); - -/* @@@ Retrieve the CFA of the given context. */ -extern _Unwind_Word _Unwind_GetCFA (struct _Unwind_Context *); - -extern void *_Unwind_GetLanguageSpecificData (struct _Unwind_Context *); - -extern _Unwind_Ptr _Unwind_GetRegionStart (struct _Unwind_Context *); - - -/* The personality routine is the function in the C++ (or other language) - runtime library which serves as an interface between the system unwind - library and language-specific exception handling semantics. It is - specific to the code fragment described by an unwind info block, and - it is always referenced via the pointer in the unwind info block, and - hence it has no ABI-specified name. - - Note that this implies that two different C++ implementations can - use different names, and have different contents in the language - specific data area. Moreover, that the language specific data - area contains no version info because name of the function invoked - provides more effective versioning by detecting at link time the - lack of code to handle the different data format. */ - -typedef _Unwind_Reason_Code (*_Unwind_Personality_Fn) - (int, _Unwind_Action, _Unwind_Exception_Class, - struct _Unwind_Exception *, struct _Unwind_Context *); - -/* @@@ The following alternate entry points are for setjmp/longjmp - based unwinding. 
*/ - -struct SjLj_Function_Context; -extern void _Unwind_SjLj_Register (struct SjLj_Function_Context *); -extern void _Unwind_SjLj_Unregister (struct SjLj_Function_Context *); - -extern _Unwind_Reason_Code LIBGCC2_UNWIND_ATTRIBUTE -_Unwind_SjLj_RaiseException (struct _Unwind_Exception *); -extern _Unwind_Reason_Code LIBGCC2_UNWIND_ATTRIBUTE -_Unwind_SjLj_ForcedUnwind (struct _Unwind_Exception *, _Unwind_Stop_Fn, void *); -extern void LIBGCC2_UNWIND_ATTRIBUTE -_Unwind_SjLj_Resume (struct _Unwind_Exception *); -extern _Unwind_Reason_Code LIBGCC2_UNWIND_ATTRIBUTE -_Unwind_SjLj_Resume_or_Rethrow (struct _Unwind_Exception *); - -/* @@@ The following provide access to the base addresses for text - and data-relative addressing in the LSDA. In order to stay link - compatible with the standard ABI for IA-64, we inline these. */ - -#ifdef __ia64__ -#include <stdlib.h> - -static inline _Unwind_Ptr -_Unwind_GetDataRelBase (struct _Unwind_Context *_C) -{ - /* The GP is stored in R1. */ - return _Unwind_GetGR (_C, 1); -} - -static inline _Unwind_Ptr -_Unwind_GetTextRelBase (struct _Unwind_Context *_C __attribute__ ((__unused__))) -{ - abort (); - return 0; -} - -/* @@@ Retrieve the Backing Store Pointer of the given context. */ -extern _Unwind_Word _Unwind_GetBSP (struct _Unwind_Context *); -#else -extern _Unwind_Ptr _Unwind_GetDataRelBase (struct _Unwind_Context *); -extern _Unwind_Ptr _Unwind_GetTextRelBase (struct _Unwind_Context *); -#endif - -/* @@@ Given an address, return the entry point of the function that - contains it. */ -extern void * _Unwind_FindEnclosingFunction (void *pc); - -#ifndef __SIZEOF_LONG__ - #error "__SIZEOF_LONG__ macro not defined" -#endif - -#ifndef __SIZEOF_POINTER__ - #error "__SIZEOF_POINTER__ macro not defined" -#endif - - -/* leb128 type numbers have a potentially unlimited size. 
- The target of the following definitions of _sleb128_t and _uleb128_t - is to have efficient data types large enough to hold the leb128 type - numbers used in the unwind code. - Mostly these types will simply be defined to long and unsigned long - except when a unsigned long data type on the target machine is not - capable of storing a pointer. */ - -#if __SIZEOF_LONG__ >= __SIZEOF_POINTER__ - typedef long _sleb128_t; - typedef unsigned long _uleb128_t; -#elif __SIZEOF_LONG_LONG__ >= __SIZEOF_POINTER__ - typedef long long _sleb128_t; - typedef unsigned long long _uleb128_t; -#else -# error "What type shall we use for _sleb128_t?" -#endif - -#if defined (__SEH__) && !defined (__USING_SJLJ_EXCEPTIONS__) -/* Handles the mapping from SEH to GCC interfaces. */ -EXCEPTION_DISPOSITION _GCC_specific_handler (PEXCEPTION_RECORD, void *, - PCONTEXT, PDISPATCHER_CONTEXT, - _Unwind_Personality_Fn); -#endif - -#ifdef __cplusplus -} -#endif - -#ifndef HIDE_EXPORTS -#pragma GCC visibility pop -#endif - -#endif /* unwind.h */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/varargs.h b/lib/gcc/x86_64-linux-android/4.8/include/varargs.h deleted file mode 100644 index 4b9803e..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/varargs.h +++ /dev/null @@ -1,7 +0,0 @@ -#ifndef _VARARGS_H -#define _VARARGS_H - -#error "GCC no longer implements <varargs.h>." -#error "Revise your code to use <stdarg.h>." - -#endif diff --git a/lib/gcc/x86_64-linux-android/4.8/include/wmmintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/wmmintrin.h deleted file mode 100644 index 93c24f4..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/wmmintrin.h +++ /dev/null @@ -1,120 +0,0 @@ -/* Copyright (C) 2008-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. 
- - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -/* Implemented from the specification included in the Intel C++ Compiler - User Guide and Reference, version 10.1. */ - -#ifndef _WMMINTRIN_H_INCLUDED -#define _WMMINTRIN_H_INCLUDED - -/* We need definitions from the SSE2 header file. */ -#include <emmintrin.h> - -#if !defined (__AES__) && !defined (__PCLMUL__) -# error "AES/PCLMUL instructions not enabled" -#else - -/* AES */ - -#ifdef __AES__ -/* Performs 1 round of AES decryption of the first m128i using - the second m128i as a round key. */ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_aesdec_si128 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_aesdec128 ((__v2di)__X, (__v2di)__Y); -} - -/* Performs the last round of AES decryption of the first m128i - using the second m128i as a round key. */ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_aesdeclast_si128 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_aesdeclast128 ((__v2di)__X, - (__v2di)__Y); -} - -/* Performs 1 round of AES encryption of the first m128i using - the second m128i as a round key. 
*/ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_aesenc_si128 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_aesenc128 ((__v2di)__X, (__v2di)__Y); -} - -/* Performs the last round of AES encryption of the first m128i - using the second m128i as a round key. */ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_aesenclast_si128 (__m128i __X, __m128i __Y) -{ - return (__m128i) __builtin_ia32_aesenclast128 ((__v2di)__X, (__v2di)__Y); -} - -/* Performs the InverseMixColumn operation on the source m128i - and stores the result into m128i destination. */ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_aesimc_si128 (__m128i __X) -{ - return (__m128i) __builtin_ia32_aesimc128 ((__v2di)__X); -} - -/* Generates a m128i round key for the input m128i AES cipher key and - byte round constant. The second parameter must be a compile time - constant. */ -#ifdef __OPTIMIZE__ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_aeskeygenassist_si128 (__m128i __X, const int __C) -{ - return (__m128i) __builtin_ia32_aeskeygenassist128 ((__v2di)__X, __C); -} -#else -#define _mm_aeskeygenassist_si128(X, C) \ - ((__m128i) __builtin_ia32_aeskeygenassist128 ((__v2di)(__m128i)(X), \ - (int)(C))) -#endif -#endif /* __AES__ */ - -/* PCLMUL */ - -#ifdef __PCLMUL__ -/* Performs carry-less integer multiplication of 64-bit halves of - 128-bit input operands. The third parameter indicates which 64-bit - halves of the input parameters v1 and v2 should be used. It must be - a compile time constant. 
*/ -#ifdef __OPTIMIZE__ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_clmulepi64_si128 (__m128i __X, __m128i __Y, const int __I) -{ - return (__m128i) __builtin_ia32_pclmulqdq128 ((__v2di)__X, - (__v2di)__Y, __I); -} -#else -#define _mm_clmulepi64_si128(X, Y, I) \ - ((__m128i) __builtin_ia32_pclmulqdq128 ((__v2di)(__m128i)(X), \ - (__v2di)(__m128i)(Y), (int)(I))) -#endif -#endif /* __PCLMUL__ */ - -#endif /* __AES__/__PCLMUL__ */ - -#endif /* _WMMINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/x86intrin.h b/lib/gcc/x86_64-linux-android/4.8/include/x86intrin.h deleted file mode 100644 index 5bf29d5..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/x86intrin.h +++ /dev/null @@ -1,122 +0,0 @@ -/* Copyright (C) 2008-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. 
*/ - -#ifndef _X86INTRIN_H_INCLUDED -#define _X86INTRIN_H_INCLUDED - -#include <ia32intrin.h> - -#ifdef __MMX__ -#include <mmintrin.h> -#endif - -#ifdef __SSE__ -#include <xmmintrin.h> -#endif - -#ifdef __SSE2__ -#include <emmintrin.h> -#endif - -#ifdef __SSE3__ -#include <pmmintrin.h> -#endif - -#ifdef __SSSE3__ -#include <tmmintrin.h> -#endif - -#ifdef __SSE4A__ -#include <ammintrin.h> -#endif - -#if defined (__SSE4_2__) || defined (__SSE4_1__) -#include <smmintrin.h> -#endif - -#if defined (__AES__) || defined (__PCLMUL__) -#include <wmmintrin.h> -#endif - -/* For including AVX instructions */ -#include <immintrin.h> - -#ifdef __3dNOW__ -#include <mm3dnow.h> -#endif - -#ifdef __FMA4__ -#include <fma4intrin.h> -#endif - -#ifdef __XOP__ -#include <xopintrin.h> -#endif - -#ifdef __LWP__ -#include <lwpintrin.h> -#endif - -#ifdef __BMI__ -#include <bmiintrin.h> -#endif - -#ifdef __BMI2__ -#include <bmi2intrin.h> -#endif - -#ifdef __TBM__ -#include <tbmintrin.h> -#endif - -#ifdef __LZCNT__ -#include <lzcntintrin.h> -#endif - -#ifdef __POPCNT__ -#include <popcntintrin.h> -#endif - -#ifdef __RDSEED__ -#include <rdseedintrin.h> -#endif - -#ifdef __PRFCHW__ -#include <prfchwintrin.h> -#endif - -#ifdef __FXSR__ -#include <fxsrintrin.h> -#endif - -#ifdef __XSAVE__ -#include <xsaveintrin.h> -#endif - -#ifdef __XSAVEOPT__ -#include <xsaveoptintrin.h> -#endif - -#include <adxintrin.h> - -#endif /* _X86INTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/xmmintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/xmmintrin.h deleted file mode 100644 index a223562..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/xmmintrin.h +++ /dev/null @@ -1,1250 +0,0 @@ -/* Copyright (C) 2002-2013 Free Software Foundation, Inc. - - This file is part of GCC. 
- - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -/* Implemented from the specification included in the Intel C++ Compiler - User Guide and Reference, version 9.0. */ - -#ifndef _XMMINTRIN_H_INCLUDED -#define _XMMINTRIN_H_INCLUDED - -#ifndef __SSE__ -# error "SSE instruction set not enabled" -#else - -/* We need type definitions from the MMX header file. */ -#include <mmintrin.h> - -/* Get _mm_malloc () and _mm_free (). */ -#include <mm_malloc.h> - -/* The Intel API is flexible enough that we must allow aliasing with other - vector types, and their scalar components. */ -typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__)); - -/* Internal data types for implementing the intrinsics. */ -typedef float __v4sf __attribute__ ((__vector_size__ (16))); - -/* Create a selector for use with the SHUFPS instruction. */ -#define _MM_SHUFFLE(fp3,fp2,fp1,fp0) \ - (((fp3) << 6) | ((fp2) << 4) | ((fp1) << 2) | (fp0)) - -/* Constants for use with _mm_prefetch. */ -enum _mm_hint -{ - _MM_HINT_T0 = 3, - _MM_HINT_T1 = 2, - _MM_HINT_T2 = 1, - _MM_HINT_NTA = 0 -}; - -/* Bits in the MXCSR. 
*/ -#define _MM_EXCEPT_MASK 0x003f -#define _MM_EXCEPT_INVALID 0x0001 -#define _MM_EXCEPT_DENORM 0x0002 -#define _MM_EXCEPT_DIV_ZERO 0x0004 -#define _MM_EXCEPT_OVERFLOW 0x0008 -#define _MM_EXCEPT_UNDERFLOW 0x0010 -#define _MM_EXCEPT_INEXACT 0x0020 - -#define _MM_MASK_MASK 0x1f80 -#define _MM_MASK_INVALID 0x0080 -#define _MM_MASK_DENORM 0x0100 -#define _MM_MASK_DIV_ZERO 0x0200 -#define _MM_MASK_OVERFLOW 0x0400 -#define _MM_MASK_UNDERFLOW 0x0800 -#define _MM_MASK_INEXACT 0x1000 - -#define _MM_ROUND_MASK 0x6000 -#define _MM_ROUND_NEAREST 0x0000 -#define _MM_ROUND_DOWN 0x2000 -#define _MM_ROUND_UP 0x4000 -#define _MM_ROUND_TOWARD_ZERO 0x6000 - -#define _MM_FLUSH_ZERO_MASK 0x8000 -#define _MM_FLUSH_ZERO_ON 0x8000 -#define _MM_FLUSH_ZERO_OFF 0x0000 - -/* Create a vector of zeros. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_setzero_ps (void) -{ - return __extension__ (__m128){ 0.0f, 0.0f, 0.0f, 0.0f }; -} - -/* Perform the respective operation on the lower SPFP (single-precision - floating-point) values of A and B; the upper three SPFP values are - passed through from A. 
*/ - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_add_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_addss ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sub_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_subss ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mul_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_mulss ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_div_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_divss ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sqrt_ss (__m128 __A) -{ - return (__m128) __builtin_ia32_sqrtss ((__v4sf)__A); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_rcp_ss (__m128 __A) -{ - return (__m128) __builtin_ia32_rcpss ((__v4sf)__A); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_rsqrt_ss (__m128 __A) -{ - return (__m128) __builtin_ia32_rsqrtss ((__v4sf)__A); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_min_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_minss ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_max_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_maxss ((__v4sf)__A, (__v4sf)__B); -} - -/* Perform the respective operation on the four SPFP values in A and B. 
*/ - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_add_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_addps ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sub_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_subps ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mul_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_mulps ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_div_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_divps ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sqrt_ps (__m128 __A) -{ - return (__m128) __builtin_ia32_sqrtps ((__v4sf)__A); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_rcp_ps (__m128 __A) -{ - return (__m128) __builtin_ia32_rcpps ((__v4sf)__A); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_rsqrt_ps (__m128 __A) -{ - return (__m128) __builtin_ia32_rsqrtps ((__v4sf)__A); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_min_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_minps ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_max_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_maxps ((__v4sf)__A, (__v4sf)__B); -} - -/* Perform logical bit-wise operations on 128-bit values. 
*/ - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_and_ps (__m128 __A, __m128 __B) -{ - return __builtin_ia32_andps (__A, __B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_andnot_ps (__m128 __A, __m128 __B) -{ - return __builtin_ia32_andnps (__A, __B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_or_ps (__m128 __A, __m128 __B) -{ - return __builtin_ia32_orps (__A, __B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_xor_ps (__m128 __A, __m128 __B) -{ - return __builtin_ia32_xorps (__A, __B); -} - -/* Perform a comparison on the lower SPFP values of A and B. If the - comparison is true, place a mask of all ones in the result, otherwise a - mask of zeros. The upper three SPFP values are passed through from A. */ - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpeq_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpeqss ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmplt_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpltss ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmple_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpless ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpgt_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_movss ((__v4sf) __A, - (__v4sf) - __builtin_ia32_cmpltss ((__v4sf) __B, - (__v4sf) - __A)); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpge_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_movss 
((__v4sf) __A, - (__v4sf) - __builtin_ia32_cmpless ((__v4sf) __B, - (__v4sf) - __A)); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpneq_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpneqss ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpnlt_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpnltss ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpnle_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpnless ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpngt_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_movss ((__v4sf) __A, - (__v4sf) - __builtin_ia32_cmpnltss ((__v4sf) __B, - (__v4sf) - __A)); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpnge_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_movss ((__v4sf) __A, - (__v4sf) - __builtin_ia32_cmpnless ((__v4sf) __B, - (__v4sf) - __A)); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpord_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpordss ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpunord_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpunordss ((__v4sf)__A, (__v4sf)__B); -} - -/* Perform a comparison on the four SPFP values of A and B. For each - element, if the comparison is true, place a mask of all ones in the - result, otherwise a mask of zeros. 
*/ - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpeq_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpeqps ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmplt_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpltps ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmple_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpleps ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpgt_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpgtps ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpge_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpgeps ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpneq_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpneqps ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpnlt_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpnltps ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpnle_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpnleps ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpngt_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpngtps ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpnge_ps (__m128 __A, 
__m128 __B) -{ - return (__m128) __builtin_ia32_cmpngeps ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpord_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpordps ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmpunord_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_cmpunordps ((__v4sf)__A, (__v4sf)__B); -} - -/* Compare the lower SPFP values of A and B and return 1 if true - and 0 if false. */ - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comieq_ss (__m128 __A, __m128 __B) -{ - return __builtin_ia32_comieq ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comilt_ss (__m128 __A, __m128 __B) -{ - return __builtin_ia32_comilt ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comile_ss (__m128 __A, __m128 __B) -{ - return __builtin_ia32_comile ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comigt_ss (__m128 __A, __m128 __B) -{ - return __builtin_ia32_comigt ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comige_ss (__m128 __A, __m128 __B) -{ - return __builtin_ia32_comige ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comineq_ss (__m128 __A, __m128 __B) -{ - return __builtin_ia32_comineq ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_ucomieq_ss (__m128 __A, __m128 __B) -{ - return __builtin_ia32_ucomieq ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline int 
__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_ucomilt_ss (__m128 __A, __m128 __B) -{ - return __builtin_ia32_ucomilt ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_ucomile_ss (__m128 __A, __m128 __B) -{ - return __builtin_ia32_ucomile ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_ucomigt_ss (__m128 __A, __m128 __B) -{ - return __builtin_ia32_ucomigt ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_ucomige_ss (__m128 __A, __m128 __B) -{ - return __builtin_ia32_ucomige ((__v4sf)__A, (__v4sf)__B); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_ucomineq_ss (__m128 __A, __m128 __B) -{ - return __builtin_ia32_ucomineq ((__v4sf)__A, (__v4sf)__B); -} - -/* Convert the lower SPFP value to a 32-bit integer according to the current - rounding mode. */ -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtss_si32 (__m128 __A) -{ - return __builtin_ia32_cvtss2si ((__v4sf) __A); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvt_ss2si (__m128 __A) -{ - return _mm_cvtss_si32 (__A); -} - -#ifdef __x86_64__ -/* Convert the lower SPFP value to a 32-bit integer according to the - current rounding mode. */ - -/* Intel intrinsic. */ -extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtss_si64 (__m128 __A) -{ - return __builtin_ia32_cvtss2si64 ((__v4sf) __A); -} - -/* Microsoft intrinsic. 
*/ -extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtss_si64x (__m128 __A) -{ - return __builtin_ia32_cvtss2si64 ((__v4sf) __A); -} -#endif - -/* Convert the two lower SPFP values to 32-bit integers according to the - current rounding mode. Return the integers in packed form. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtps_pi32 (__m128 __A) -{ - return (__m64) __builtin_ia32_cvtps2pi ((__v4sf) __A); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvt_ps2pi (__m128 __A) -{ - return _mm_cvtps_pi32 (__A); -} - -/* Truncate the lower SPFP value to a 32-bit integer. */ -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvttss_si32 (__m128 __A) -{ - return __builtin_ia32_cvttss2si ((__v4sf) __A); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtt_ss2si (__m128 __A) -{ - return _mm_cvttss_si32 (__A); -} - -#ifdef __x86_64__ -/* Truncate the lower SPFP value to a 32-bit integer. */ - -/* Intel intrinsic. */ -extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvttss_si64 (__m128 __A) -{ - return __builtin_ia32_cvttss2si64 ((__v4sf) __A); -} - -/* Microsoft intrinsic. */ -extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvttss_si64x (__m128 __A) -{ - return __builtin_ia32_cvttss2si64 ((__v4sf) __A); -} -#endif - -/* Truncate the two lower SPFP values to 32-bit integers. Return the - integers in packed form. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvttps_pi32 (__m128 __A) -{ - return (__m64) __builtin_ia32_cvttps2pi ((__v4sf) __A); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtt_ps2pi (__m128 __A) -{ - return _mm_cvttps_pi32 (__A); -} - -/* Convert B to a SPFP value and insert it as element zero in A. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi32_ss (__m128 __A, int __B) -{ - return (__m128) __builtin_ia32_cvtsi2ss ((__v4sf) __A, __B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvt_si2ss (__m128 __A, int __B) -{ - return _mm_cvtsi32_ss (__A, __B); -} - -#ifdef __x86_64__ -/* Convert B to a SPFP value and insert it as element zero in A. */ - -/* Intel intrinsic. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi64_ss (__m128 __A, long long __B) -{ - return (__m128) __builtin_ia32_cvtsi642ss ((__v4sf) __A, __B); -} - -/* Microsoft intrinsic. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtsi64x_ss (__m128 __A, long long __B) -{ - return (__m128) __builtin_ia32_cvtsi642ss ((__v4sf) __A, __B); -} -#endif - -/* Convert the two 32-bit values in B to SPFP form and insert them - as the two lower elements in A. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtpi32_ps (__m128 __A, __m64 __B) -{ - return (__m128) __builtin_ia32_cvtpi2ps ((__v4sf) __A, (__v2si)__B); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvt_pi2ps (__m128 __A, __m64 __B) -{ - return _mm_cvtpi32_ps (__A, __B); -} - -/* Convert the four signed 16-bit values in A to SPFP form. 
*/ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtpi16_ps (__m64 __A) -{ - __v4hi __sign; - __v2si __hisi, __losi; - __v4sf __zero, __ra, __rb; - - /* This comparison against zero gives us a mask that can be used to - fill in the missing sign bits in the unpack operations below, so - that we get signed values after unpacking. */ - __sign = __builtin_ia32_pcmpgtw ((__v4hi)0LL, (__v4hi)__A); - - /* Convert the four words to doublewords. */ - __losi = (__v2si) __builtin_ia32_punpcklwd ((__v4hi)__A, __sign); - __hisi = (__v2si) __builtin_ia32_punpckhwd ((__v4hi)__A, __sign); - - /* Convert the doublewords to floating point two at a time. */ - __zero = (__v4sf) _mm_setzero_ps (); - __ra = __builtin_ia32_cvtpi2ps (__zero, __losi); - __rb = __builtin_ia32_cvtpi2ps (__ra, __hisi); - - return (__m128) __builtin_ia32_movlhps (__ra, __rb); -} - -/* Convert the four unsigned 16-bit values in A to SPFP form. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtpu16_ps (__m64 __A) -{ - __v2si __hisi, __losi; - __v4sf __zero, __ra, __rb; - - /* Convert the four words to doublewords. */ - __losi = (__v2si) __builtin_ia32_punpcklwd ((__v4hi)__A, (__v4hi)0LL); - __hisi = (__v2si) __builtin_ia32_punpckhwd ((__v4hi)__A, (__v4hi)0LL); - - /* Convert the doublewords to floating point two at a time. */ - __zero = (__v4sf) _mm_setzero_ps (); - __ra = __builtin_ia32_cvtpi2ps (__zero, __losi); - __rb = __builtin_ia32_cvtpi2ps (__ra, __hisi); - - return (__m128) __builtin_ia32_movlhps (__ra, __rb); -} - -/* Convert the low four signed 8-bit values in A to SPFP form. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtpi8_ps (__m64 __A) -{ - __v8qi __sign; - - /* This comparison against zero gives us a mask that can be used to - fill in the missing sign bits in the unpack operations below, so - that we get signed values after unpacking. 
*/ - __sign = __builtin_ia32_pcmpgtb ((__v8qi)0LL, (__v8qi)__A); - - /* Convert the four low bytes to words. */ - __A = (__m64) __builtin_ia32_punpcklbw ((__v8qi)__A, __sign); - - return _mm_cvtpi16_ps(__A); -} - -/* Convert the low four unsigned 8-bit values in A to SPFP form. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtpu8_ps(__m64 __A) -{ - __A = (__m64) __builtin_ia32_punpcklbw ((__v8qi)__A, (__v8qi)0LL); - return _mm_cvtpu16_ps(__A); -} - -/* Convert the four signed 32-bit values in A and B to SPFP form. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtpi32x2_ps(__m64 __A, __m64 __B) -{ - __v4sf __zero = (__v4sf) _mm_setzero_ps (); - __v4sf __sfa = __builtin_ia32_cvtpi2ps (__zero, (__v2si)__A); - __v4sf __sfb = __builtin_ia32_cvtpi2ps (__sfa, (__v2si)__B); - return (__m128) __builtin_ia32_movlhps (__sfa, __sfb); -} - -/* Convert the four SPFP values in A to four signed 16-bit integers. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtps_pi16(__m128 __A) -{ - __v4sf __hisf = (__v4sf)__A; - __v4sf __losf = __builtin_ia32_movhlps (__hisf, __hisf); - __v2si __hisi = __builtin_ia32_cvtps2pi (__hisf); - __v2si __losi = __builtin_ia32_cvtps2pi (__losf); - return (__m64) __builtin_ia32_packssdw (__hisi, __losi); -} - -/* Convert the four SPFP values in A to four signed 8-bit integers. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtps_pi8(__m128 __A) -{ - __v4hi __tmp = (__v4hi) _mm_cvtps_pi16 (__A); - return (__m64) __builtin_ia32_packsswb (__tmp, (__v4hi)0LL); -} - -/* Selects four specific SPFP values from A and B based on MASK. 
*/ -#ifdef __OPTIMIZE__ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_shuffle_ps (__m128 __A, __m128 __B, int const __mask) -{ - return (__m128) __builtin_ia32_shufps ((__v4sf)__A, (__v4sf)__B, __mask); -} -#else -#define _mm_shuffle_ps(A, B, MASK) \ - ((__m128) __builtin_ia32_shufps ((__v4sf)(__m128)(A), \ - (__v4sf)(__m128)(B), (int)(MASK))) -#endif - -/* Selects and interleaves the upper two SPFP values from A and B. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpackhi_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_unpckhps ((__v4sf)__A, (__v4sf)__B); -} - -/* Selects and interleaves the lower two SPFP values from A and B. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_unpacklo_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_unpcklps ((__v4sf)__A, (__v4sf)__B); -} - -/* Sets the upper two SPFP values with 64-bits of data loaded from P; - the lower two values are passed through from A. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_loadh_pi (__m128 __A, __m64 const *__P) -{ - return (__m128) __builtin_ia32_loadhps ((__v4sf)__A, (const __v2sf *)__P); -} - -/* Stores the upper two SPFP values of A into P. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_storeh_pi (__m64 *__P, __m128 __A) -{ - __builtin_ia32_storehps ((__v2sf *)__P, (__v4sf)__A); -} - -/* Moves the upper two values of B into the lower two values of A. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_movehl_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_movhlps ((__v4sf)__A, (__v4sf)__B); -} - -/* Moves the lower two values of B into the upper two values of A. 
*/ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_movelh_ps (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_movlhps ((__v4sf)__A, (__v4sf)__B); -} - -/* Sets the lower two SPFP values with 64-bits of data loaded from P; - the upper two values are passed through from A. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_loadl_pi (__m128 __A, __m64 const *__P) -{ - return (__m128) __builtin_ia32_loadlps ((__v4sf)__A, (const __v2sf *)__P); -} - -/* Stores the lower two SPFP values of A into P. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_storel_pi (__m64 *__P, __m128 __A) -{ - __builtin_ia32_storelps ((__v2sf *)__P, (__v4sf)__A); -} - -/* Creates a 4-bit mask from the most significant bits of the SPFP values. */ -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_movemask_ps (__m128 __A) -{ - return __builtin_ia32_movmskps ((__v4sf)__A); -} - -/* Return the contents of the control register. */ -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_getcsr (void) -{ - return __builtin_ia32_stmxcsr (); -} - -/* Read exception bits from the control register. 
*/ -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_MM_GET_EXCEPTION_STATE (void) -{ - return _mm_getcsr() & _MM_EXCEPT_MASK; -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_MM_GET_EXCEPTION_MASK (void) -{ - return _mm_getcsr() & _MM_MASK_MASK; -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_MM_GET_ROUNDING_MODE (void) -{ - return _mm_getcsr() & _MM_ROUND_MASK; -} - -extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_MM_GET_FLUSH_ZERO_MODE (void) -{ - return _mm_getcsr() & _MM_FLUSH_ZERO_MASK; -} - -/* Set the control register to I. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_setcsr (unsigned int __I) -{ - __builtin_ia32_ldmxcsr (__I); -} - -/* Set exception bits in the control register. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_MM_SET_EXCEPTION_STATE(unsigned int __mask) -{ - _mm_setcsr((_mm_getcsr() & ~_MM_EXCEPT_MASK) | __mask); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_MM_SET_EXCEPTION_MASK (unsigned int __mask) -{ - _mm_setcsr((_mm_getcsr() & ~_MM_MASK_MASK) | __mask); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_MM_SET_ROUNDING_MODE (unsigned int __mode) -{ - _mm_setcsr((_mm_getcsr() & ~_MM_ROUND_MASK) | __mode); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_MM_SET_FLUSH_ZERO_MODE (unsigned int __mode) -{ - _mm_setcsr((_mm_getcsr() & ~_MM_FLUSH_ZERO_MASK) | __mode); -} - -/* Create a vector with element 0 as F and the rest zero. 
*/ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set_ss (float __F) -{ - return __extension__ (__m128)(__v4sf){ __F, 0.0f, 0.0f, 0.0f }; -} - -/* Create a vector with all four elements equal to F. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set1_ps (float __F) -{ - return __extension__ (__m128)(__v4sf){ __F, __F, __F, __F }; -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set_ps1 (float __F) -{ - return _mm_set1_ps (__F); -} - -/* Create a vector with element 0 as *P and the rest zero. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_load_ss (float const *__P) -{ - return _mm_set_ss (*__P); -} - -/* Create a vector with all four elements equal to *P. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_load1_ps (float const *__P) -{ - return _mm_set1_ps (*__P); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_load_ps1 (float const *__P) -{ - return _mm_load1_ps (__P); -} - -/* Load four SPFP values from P. The address must be 16-byte aligned. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_load_ps (float const *__P) -{ - return (__m128) *(__v4sf *)__P; -} - -/* Load four SPFP values from P. The address need not be 16-byte aligned. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_loadu_ps (float const *__P) -{ - return (__m128) __builtin_ia32_loadups (__P); -} - -/* Load four SPFP values in reverse order. The address must be aligned. 
*/ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_loadr_ps (float const *__P) -{ - __v4sf __tmp = *(__v4sf *)__P; - return (__m128) __builtin_ia32_shufps (__tmp, __tmp, _MM_SHUFFLE (0,1,2,3)); -} - -/* Create the vector [Z Y X W]. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_set_ps (const float __Z, const float __Y, const float __X, const float __W) -{ - return __extension__ (__m128)(__v4sf){ __W, __X, __Y, __Z }; -} - -/* Create the vector [W X Y Z]. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_setr_ps (float __Z, float __Y, float __X, float __W) -{ - return __extension__ (__m128)(__v4sf){ __Z, __Y, __X, __W }; -} - -/* Stores the lower SPFP value. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_store_ss (float *__P, __m128 __A) -{ - *__P = __builtin_ia32_vec_ext_v4sf ((__v4sf)__A, 0); -} - -extern __inline float __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cvtss_f32 (__m128 __A) -{ - return __builtin_ia32_vec_ext_v4sf ((__v4sf)__A, 0); -} - -/* Store four SPFP values. The address must be 16-byte aligned. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_store_ps (float *__P, __m128 __A) -{ - *(__v4sf *)__P = (__v4sf)__A; -} - -/* Store four SPFP values. The address need not be 16-byte aligned. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_storeu_ps (float *__P, __m128 __A) -{ - __builtin_ia32_storeups (__P, (__v4sf)__A); -} - -/* Store the lower SPFP value across four words. 
*/ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_store1_ps (float *__P, __m128 __A) -{ - __v4sf __va = (__v4sf)__A; - __v4sf __tmp = __builtin_ia32_shufps (__va, __va, _MM_SHUFFLE (0,0,0,0)); - _mm_storeu_ps (__P, __tmp); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_store_ps1 (float *__P, __m128 __A) -{ - _mm_store1_ps (__P, __A); -} - -/* Store four SPFP values in reverse order. The address must be aligned. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_storer_ps (float *__P, __m128 __A) -{ - __v4sf __va = (__v4sf)__A; - __v4sf __tmp = __builtin_ia32_shufps (__va, __va, _MM_SHUFFLE (0,1,2,3)); - _mm_store_ps (__P, __tmp); -} - -/* Sets the low SPFP value of A from the low value of B. */ -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_move_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_movss ((__v4sf)__A, (__v4sf)__B); -} - -/* Extracts one of the four words of A. The selector N must be immediate. */ -#ifdef __OPTIMIZE__ -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_extract_pi16 (__m64 const __A, int const __N) -{ - return __builtin_ia32_vec_ext_v4hi ((__v4hi)__A, __N); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pextrw (__m64 const __A, int const __N) -{ - return _mm_extract_pi16 (__A, __N); -} -#else -#define _mm_extract_pi16(A, N) \ - ((int) __builtin_ia32_vec_ext_v4hi ((__v4hi)(__m64)(A), (int)(N))) - -#define _m_pextrw(A, N) _mm_extract_pi16(A, N) -#endif - -/* Inserts word D into one of four words of A. The selector N must be - immediate. 
*/ -#ifdef __OPTIMIZE__ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_insert_pi16 (__m64 const __A, int const __D, int const __N) -{ - return (__m64) __builtin_ia32_vec_set_v4hi ((__v4hi)__A, __D, __N); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pinsrw (__m64 const __A, int const __D, int const __N) -{ - return _mm_insert_pi16 (__A, __D, __N); -} -#else -#define _mm_insert_pi16(A, D, N) \ - ((__m64) __builtin_ia32_vec_set_v4hi ((__v4hi)(__m64)(A), \ - (int)(D), (int)(N))) - -#define _m_pinsrw(A, D, N) _mm_insert_pi16(A, D, N) -#endif - -/* Compute the element-wise maximum of signed 16-bit values. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_max_pi16 (__m64 __A, __m64 __B) -{ - return (__m64) __builtin_ia32_pmaxsw ((__v4hi)__A, (__v4hi)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pmaxsw (__m64 __A, __m64 __B) -{ - return _mm_max_pi16 (__A, __B); -} - -/* Compute the element-wise maximum of unsigned 8-bit values. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_max_pu8 (__m64 __A, __m64 __B) -{ - return (__m64) __builtin_ia32_pmaxub ((__v8qi)__A, (__v8qi)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pmaxub (__m64 __A, __m64 __B) -{ - return _mm_max_pu8 (__A, __B); -} - -/* Compute the element-wise minimum of signed 16-bit values. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_min_pi16 (__m64 __A, __m64 __B) -{ - return (__m64) __builtin_ia32_pminsw ((__v4hi)__A, (__v4hi)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pminsw (__m64 __A, __m64 __B) -{ - return _mm_min_pi16 (__A, __B); -} - -/* Compute the element-wise minimum of unsigned 8-bit values. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_min_pu8 (__m64 __A, __m64 __B) -{ - return (__m64) __builtin_ia32_pminub ((__v8qi)__A, (__v8qi)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pminub (__m64 __A, __m64 __B) -{ - return _mm_min_pu8 (__A, __B); -} - -/* Create an 8-bit mask of the signs of 8-bit values. */ -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_movemask_pi8 (__m64 __A) -{ - return __builtin_ia32_pmovmskb ((__v8qi)__A); -} - -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pmovmskb (__m64 __A) -{ - return _mm_movemask_pi8 (__A); -} - -/* Multiply four unsigned 16-bit values in A by four unsigned 16-bit values - in B and produce the high 16 bits of the 32-bit results. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mulhi_pu16 (__m64 __A, __m64 __B) -{ - return (__m64) __builtin_ia32_pmulhuw ((__v4hi)__A, (__v4hi)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pmulhuw (__m64 __A, __m64 __B) -{ - return _mm_mulhi_pu16 (__A, __B); -} - -/* Return a combination of the four 16-bit values in A. The selector - must be an immediate. 
*/ -#ifdef __OPTIMIZE__ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_shuffle_pi16 (__m64 __A, int const __N) -{ - return (__m64) __builtin_ia32_pshufw ((__v4hi)__A, __N); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pshufw (__m64 __A, int const __N) -{ - return _mm_shuffle_pi16 (__A, __N); -} -#else -#define _mm_shuffle_pi16(A, N) \ - ((__m64) __builtin_ia32_pshufw ((__v4hi)(__m64)(A), (int)(N))) - -#define _m_pshufw(A, N) _mm_shuffle_pi16 (A, N) -#endif - -/* Conditionally store byte elements of A into P. The high bit of each - byte in the selector N determines whether the corresponding byte from - A is stored. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maskmove_si64 (__m64 __A, __m64 __N, char *__P) -{ - __builtin_ia32_maskmovq ((__v8qi)__A, (__v8qi)__N, __P); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_maskmovq (__m64 __A, __m64 __N, char *__P) -{ - _mm_maskmove_si64 (__A, __N, __P); -} - -/* Compute the rounded averages of the unsigned 8-bit values in A and B. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_avg_pu8 (__m64 __A, __m64 __B) -{ - return (__m64) __builtin_ia32_pavgb ((__v8qi)__A, (__v8qi)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pavgb (__m64 __A, __m64 __B) -{ - return _mm_avg_pu8 (__A, __B); -} - -/* Compute the rounded averages of the unsigned 16-bit values in A and B. 
*/ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_avg_pu16 (__m64 __A, __m64 __B) -{ - return (__m64) __builtin_ia32_pavgw ((__v4hi)__A, (__v4hi)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_pavgw (__m64 __A, __m64 __B) -{ - return _mm_avg_pu16 (__A, __B); -} - -/* Compute the sum of the absolute differences of the unsigned 8-bit - values in A and B. Return the value in the lower 16-bit word; the - upper words are cleared. */ -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sad_pu8 (__m64 __A, __m64 __B) -{ - return (__m64) __builtin_ia32_psadbw ((__v8qi)__A, (__v8qi)__B); -} - -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_m_psadbw (__m64 __A, __m64 __B) -{ - return _mm_sad_pu8 (__A, __B); -} - -/* Loads one cache line from address P to a location "closer" to the - processor. The selector I specifies the type of prefetch operation. */ -#ifdef __OPTIMIZE__ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_prefetch (const void *__P, enum _mm_hint __I) -{ - __builtin_prefetch (__P, 0, __I); -} -#else -#define _mm_prefetch(P, I) \ - __builtin_prefetch ((P), 0, (I)) -#endif - -/* Stores the data in A to the address P without polluting the caches. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_stream_pi (__m64 *__P, __m64 __A) -{ - __builtin_ia32_movntq ((unsigned long long *)__P, (unsigned long long)__A); -} - -/* Likewise. The address must be 16-byte aligned. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_stream_ps (float *__P, __m128 __A) -{ - __builtin_ia32_movntps (__P, (__v4sf)__A); -} - -/* Guarantees that every preceding store is globally visible before - any subsequent store. 
*/ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sfence (void) -{ - __builtin_ia32_sfence (); -} - -/* The execution of the next instruction is delayed by an implementation - specific amount of time. The instruction does not modify the - architectural state. */ -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_pause (void) -{ - __builtin_ia32_pause (); -} - -/* Transpose the 4x4 matrix composed of row[0-3]. */ -#define _MM_TRANSPOSE4_PS(row0, row1, row2, row3) \ -do { \ - __v4sf __r0 = (row0), __r1 = (row1), __r2 = (row2), __r3 = (row3); \ - __v4sf __t0 = __builtin_ia32_unpcklps (__r0, __r1); \ - __v4sf __t1 = __builtin_ia32_unpcklps (__r2, __r3); \ - __v4sf __t2 = __builtin_ia32_unpckhps (__r0, __r1); \ - __v4sf __t3 = __builtin_ia32_unpckhps (__r2, __r3); \ - (row0) = __builtin_ia32_movlhps (__t0, __t1); \ - (row1) = __builtin_ia32_movhlps (__t1, __t0); \ - (row2) = __builtin_ia32_movlhps (__t2, __t3); \ - (row3) = __builtin_ia32_movhlps (__t3, __t2); \ -} while (0) - -/* For backward source compatibility. */ -#ifdef __SSE2__ -# include <emmintrin.h> -#endif - -#endif /* __SSE__ */ -#endif /* _XMMINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/xopintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/xopintrin.h deleted file mode 100644 index d2a99a1..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/xopintrin.h +++ /dev/null @@ -1,839 +0,0 @@ -/* Copyright (C) 2007-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#ifndef _X86INTRIN_H_INCLUDED -# error "Never use <xopintrin.h> directly; include <x86intrin.h> instead." -#endif - -#ifndef _XOPMMINTRIN_H_INCLUDED -#define _XOPMMINTRIN_H_INCLUDED - -#ifndef __XOP__ -# error "XOP instruction set not enabled" -#else - -#include <fma4intrin.h> - -/* Integer multiply/add intructions. */ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maccs_epi16(__m128i __A, __m128i __B, __m128i __C) -{ - return (__m128i) __builtin_ia32_vpmacssww ((__v8hi)__A,(__v8hi)__B, (__v8hi)__C); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_macc_epi16(__m128i __A, __m128i __B, __m128i __C) -{ - return (__m128i) __builtin_ia32_vpmacsww ((__v8hi)__A, (__v8hi)__B, (__v8hi)__C); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maccsd_epi16(__m128i __A, __m128i __B, __m128i __C) -{ - return (__m128i) __builtin_ia32_vpmacsswd ((__v8hi)__A, (__v8hi)__B, (__v4si)__C); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maccd_epi16(__m128i __A, __m128i __B, __m128i __C) -{ - return (__m128i) __builtin_ia32_vpmacswd ((__v8hi)__A, (__v8hi)__B, (__v4si)__C); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maccs_epi32(__m128i __A, __m128i __B, __m128i __C) -{ - return (__m128i) __builtin_ia32_vpmacssdd ((__v4si)__A, 
(__v4si)__B, (__v4si)__C); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_macc_epi32(__m128i __A, __m128i __B, __m128i __C) -{ - return (__m128i) __builtin_ia32_vpmacsdd ((__v4si)__A, (__v4si)__B, (__v4si)__C); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maccslo_epi32(__m128i __A, __m128i __B, __m128i __C) -{ - return (__m128i) __builtin_ia32_vpmacssdql ((__v4si)__A, (__v4si)__B, (__v2di)__C); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_macclo_epi32(__m128i __A, __m128i __B, __m128i __C) -{ - return (__m128i) __builtin_ia32_vpmacsdql ((__v4si)__A, (__v4si)__B, (__v2di)__C); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maccshi_epi32(__m128i __A, __m128i __B, __m128i __C) -{ - return (__m128i) __builtin_ia32_vpmacssdqh ((__v4si)__A, (__v4si)__B, (__v2di)__C); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_macchi_epi32(__m128i __A, __m128i __B, __m128i __C) -{ - return (__m128i) __builtin_ia32_vpmacsdqh ((__v4si)__A, (__v4si)__B, (__v2di)__C); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maddsd_epi16(__m128i __A, __m128i __B, __m128i __C) -{ - return (__m128i) __builtin_ia32_vpmadcsswd ((__v8hi)__A,(__v8hi)__B,(__v4si)__C); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maddd_epi16(__m128i __A, __m128i __B, __m128i __C) -{ - return (__m128i) __builtin_ia32_vpmadcswd ((__v8hi)__A,(__v8hi)__B,(__v4si)__C); -} - -/* Packed Integer Horizontal Add and Subtract */ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_haddw_epi8(__m128i __A) -{ - return (__m128i) __builtin_ia32_vphaddbw ((__v16qi)__A); -} - -extern __inline __m128i 
__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_haddd_epi8(__m128i __A) -{ - return (__m128i) __builtin_ia32_vphaddbd ((__v16qi)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_haddq_epi8(__m128i __A) -{ - return (__m128i) __builtin_ia32_vphaddbq ((__v16qi)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_haddd_epi16(__m128i __A) -{ - return (__m128i) __builtin_ia32_vphaddwd ((__v8hi)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_haddq_epi16(__m128i __A) -{ - return (__m128i) __builtin_ia32_vphaddwq ((__v8hi)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_haddq_epi32(__m128i __A) -{ - return (__m128i) __builtin_ia32_vphadddq ((__v4si)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_haddw_epu8(__m128i __A) -{ - return (__m128i) __builtin_ia32_vphaddubw ((__v16qi)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_haddd_epu8(__m128i __A) -{ - return (__m128i) __builtin_ia32_vphaddubd ((__v16qi)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_haddq_epu8(__m128i __A) -{ - return (__m128i) __builtin_ia32_vphaddubq ((__v16qi)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_haddd_epu16(__m128i __A) -{ - return (__m128i) __builtin_ia32_vphadduwd ((__v8hi)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_haddq_epu16(__m128i __A) -{ - return (__m128i) __builtin_ia32_vphadduwq ((__v8hi)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_haddq_epu32(__m128i __A) -{ - return 
(__m128i) __builtin_ia32_vphaddudq ((__v4si)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hsubw_epi8(__m128i __A) -{ - return (__m128i) __builtin_ia32_vphsubbw ((__v16qi)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hsubd_epi16(__m128i __A) -{ - return (__m128i) __builtin_ia32_vphsubwd ((__v8hi)__A); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_hsubq_epi32(__m128i __A) -{ - return (__m128i) __builtin_ia32_vphsubdq ((__v4si)__A); -} - -/* Vector conditional move and permute */ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_cmov_si128(__m128i __A, __m128i __B, __m128i __C) -{ - return (__m128i) __builtin_ia32_vpcmov (__A, __B, __C); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C) -{ - return (__m128i) __builtin_ia32_vpperm ((__v16qi)__A, (__v16qi)__B, (__v16qi)__C); -} - -/* Packed Integer Rotates and Shifts - Rotates - Non-Immediate form */ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_rot_epi8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vprotb ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_rot_epi16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vprotw ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_rot_epi32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vprotd ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_rot_epi64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vprotq 
((__v2di)__A, (__v2di)__B); -} - -/* Rotates - Immediate form */ - -#ifdef __OPTIMIZE__ -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_roti_epi8(__m128i __A, const int __B) -{ - return (__m128i) __builtin_ia32_vprotbi ((__v16qi)__A, __B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_roti_epi16(__m128i __A, const int __B) -{ - return (__m128i) __builtin_ia32_vprotwi ((__v8hi)__A, __B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_roti_epi32(__m128i __A, const int __B) -{ - return (__m128i) __builtin_ia32_vprotdi ((__v4si)__A, __B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_roti_epi64(__m128i __A, const int __B) -{ - return (__m128i) __builtin_ia32_vprotqi ((__v2di)__A, __B); -} -#else -#define _mm_roti_epi8(A, N) \ - ((__m128i) __builtin_ia32_vprotbi ((__v16qi)(__m128i)(A), (int)(N))) -#define _mm_roti_epi16(A, N) \ - ((__m128i) __builtin_ia32_vprotwi ((__v8hi)(__m128i)(A), (int)(N))) -#define _mm_roti_epi32(A, N) \ - ((__m128i) __builtin_ia32_vprotdi ((__v4si)(__m128i)(A), (int)(N))) -#define _mm_roti_epi64(A, N) \ - ((__m128i) __builtin_ia32_vprotqi ((__v2di)(__m128i)(A), (int)(N))) -#endif - -/* Shifts */ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_shl_epi8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpshlb ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_shl_epi16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpshlw ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_shl_epi32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpshld ((__v4si)__A, (__v4si)__B); -} - -extern __inline 
__m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_shl_epi64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpshlq ((__v2di)__A, (__v2di)__B); -} - - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sha_epi8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpshab ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sha_epi16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpshaw ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sha_epi32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpshad ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_sha_epi64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpshaq ((__v2di)__A, (__v2di)__B); -} - -/* Compare and Predicate Generation - pcom (integer, unsinged bytes) */ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comlt_epu8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomltub ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comle_epu8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomleub ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comgt_epu8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomgtub ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comge_epu8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomgeub ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i 
__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comeq_epu8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomequb ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comneq_epu8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomnequb ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comfalse_epu8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomfalseub ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comtrue_epu8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomtrueub ((__v16qi)__A, (__v16qi)__B); -} - -/*pcom (integer, unsinged words) */ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comlt_epu16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomltuw ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comle_epu16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomleuw ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comgt_epu16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomgtuw ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comge_epu16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomgeuw ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comeq_epu16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomequw ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i 
__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comneq_epu16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomnequw ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comfalse_epu16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomfalseuw ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comtrue_epu16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomtrueuw ((__v8hi)__A, (__v8hi)__B); -} - -/*pcom (integer, unsinged double words) */ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comlt_epu32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomltud ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comle_epu32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomleud ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comgt_epu32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomgtud ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comge_epu32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomgeud ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comeq_epu32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomequd ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comneq_epu32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomnequd ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i 
__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comfalse_epu32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomfalseud ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comtrue_epu32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomtrueud ((__v4si)__A, (__v4si)__B); -} - -/*pcom (integer, unsinged quad words) */ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comlt_epu64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomltuq ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comle_epu64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomleuq ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comgt_epu64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomgtuq ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comge_epu64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomgeuq ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comeq_epu64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomequq ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comneq_epu64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomnequq ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comfalse_epu64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomfalseuq ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i 
__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comtrue_epu64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomtrueuq ((__v2di)__A, (__v2di)__B); -} - -/*pcom (integer, signed bytes) */ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comlt_epi8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomltb ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comle_epi8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomleb ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comgt_epi8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomgtb ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comge_epi8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomgeb ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comeq_epi8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomeqb ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comneq_epi8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomneqb ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comfalse_epi8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomfalseb ((__v16qi)__A, (__v16qi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comtrue_epi8(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomtrueb ((__v16qi)__A, (__v16qi)__B); -} - -/*pcom (integer, signed words) */ - 
-extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comlt_epi16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomltw ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comle_epi16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomlew ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comgt_epi16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomgtw ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comge_epi16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomgew ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comeq_epi16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomeqw ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comneq_epi16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomneqw ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comfalse_epi16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomfalsew ((__v8hi)__A, (__v8hi)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comtrue_epi16(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomtruew ((__v8hi)__A, (__v8hi)__B); -} - -/*pcom (integer, signed double words) */ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comlt_epi32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomltd ((__v4si)__A, (__v4si)__B); -} - -extern __inline 
__m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comle_epi32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomled ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comgt_epi32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomgtd ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comge_epi32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomged ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comeq_epi32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomeqd ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comneq_epi32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomneqd ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comfalse_epi32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomfalsed ((__v4si)__A, (__v4si)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comtrue_epi32(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomtrued ((__v4si)__A, (__v4si)__B); -} - -/*pcom (integer, signed quad words) */ - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comlt_epi64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomltq ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comle_epi64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomleq ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i 
__attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comgt_epi64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomgtq ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comge_epi64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomgeq ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comeq_epi64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomeqq ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comneq_epi64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomneqq ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comfalse_epi64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomfalseq ((__v2di)__A, (__v2di)__B); -} - -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_comtrue_epi64(__m128i __A, __m128i __B) -{ - return (__m128i) __builtin_ia32_vpcomtrueq ((__v2di)__A, (__v2di)__B); -} - -/* FRCZ */ - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_frcz_ps (__m128 __A) -{ - return (__m128) __builtin_ia32_vfrczps ((__v4sf)__A); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_frcz_pd (__m128d __A) -{ - return (__m128d) __builtin_ia32_vfrczpd ((__v2df)__A); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_frcz_ss (__m128 __A, __m128 __B) -{ - return (__m128) __builtin_ia32_movss ((__v4sf)__A, - (__v4sf) - __builtin_ia32_vfrczss ((__v4sf)__B)); -} - -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_frcz_sd 
(__m128d __A, __m128d __B) -{ - return (__m128d) __builtin_ia32_movsd ((__v2df)__A, - (__v2df) - __builtin_ia32_vfrczsd ((__v2df)__B)); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_frcz_ps (__m256 __A) -{ - return (__m256) __builtin_ia32_vfrczps256 ((__v8sf)__A); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_frcz_pd (__m256d __A) -{ - return (__m256d) __builtin_ia32_vfrczpd256 ((__v4df)__A); -} - -/* PERMIL2 */ - -#ifdef __OPTIMIZE__ -extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_permute2_pd (__m128d __X, __m128d __Y, __m128i __C, const int __I) -{ - return (__m128d) __builtin_ia32_vpermil2pd ((__v2df)__X, - (__v2df)__Y, - (__v2di)__C, - __I); -} - -extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_permute2_pd (__m256d __X, __m256d __Y, __m256i __C, const int __I) -{ - return (__m256d) __builtin_ia32_vpermil2pd256 ((__v4df)__X, - (__v4df)__Y, - (__v4di)__C, - __I); -} - -extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_permute2_ps (__m128 __X, __m128 __Y, __m128i __C, const int __I) -{ - return (__m128) __builtin_ia32_vpermil2ps ((__v4sf)__X, - (__v4sf)__Y, - (__v4si)__C, - __I); -} - -extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm256_permute2_ps (__m256 __X, __m256 __Y, __m256i __C, const int __I) -{ - return (__m256) __builtin_ia32_vpermil2ps256 ((__v8sf)__X, - (__v8sf)__Y, - (__v8si)__C, - __I); -} -#else -#define _mm_permute2_pd(X, Y, C, I) \ - ((__m128d) __builtin_ia32_vpermil2pd ((__v2df)(__m128d)(X), \ - (__v2df)(__m128d)(Y), \ - (__v2di)(__m128d)(C), \ - (int)(I))) - -#define _mm256_permute2_pd(X, Y, C, I) \ - ((__m256d) __builtin_ia32_vpermil2pd256 ((__v4df)(__m256d)(X), \ - (__v4df)(__m256d)(Y), \ - (__v4di)(__m256d)(C), \ - (int)(I))) - -#define 
_mm_permute2_ps(X, Y, C, I) \ - ((__m128) __builtin_ia32_vpermil2ps ((__v4sf)(__m128)(X), \ - (__v4sf)(__m128)(Y), \ - (__v4si)(__m128)(C), \ - (int)(I))) - -#define _mm256_permute2_ps(X, Y, C, I) \ - ((__m256) __builtin_ia32_vpermil2ps256 ((__v8sf)(__m256)(X), \ - (__v8sf)(__m256)(Y), \ - (__v8si)(__m256)(C), \ - (int)(I))) -#endif /* __OPTIMIZE__ */ - -#endif /* __XOP__ */ - -#endif /* _XOPMMINTRIN_H_INCLUDED */ diff --git a/lib/gcc/x86_64-linux-android/4.8/include/xsaveintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/xsaveintrin.h deleted file mode 100644 index f566589..0000000 --- a/lib/gcc/x86_64-linux-android/4.8/include/xsaveintrin.h +++ /dev/null @@ -1,61 +0,0 @@ -/* Copyright (C) 2012-2013 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -/* #if !defined _X86INTRIN_H_INCLUDED && !defined _IMMINTRIN_H_INCLUDED */ -/* # error "Never use <xsaveintrin.h> directly; include <x86intrin.h> instead." 
*/
-/* #endif */
-
-#ifndef _XSAVEINTRIN_H_INCLUDED
-#define _XSAVEINTRIN_H_INCLUDED
-
-extern __inline void
-__attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_xsave (void *__P, long long __M)
-{
-  return __builtin_ia32_xsave (__P, __M);
-}
-
-extern __inline void
-__attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_xrstor (void *__P, long long __M)
-{
-  return __builtin_ia32_xrstor (__P, __M);
-}
-
-#ifdef __x86_64__
-extern __inline void
-__attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_xsave64 (void *__P, long long __M)
-{
-  return __builtin_ia32_xsave64 (__P, __M);
-}
-
-extern __inline void
-__attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_xrstor64 (void *__P, long long __M)
-{
-  return __builtin_ia32_xrstor64 (__P, __M);
-}
-#endif
-
-#endif /* _XSAVEINTRIN_H_INCLUDED */
diff --git a/lib/gcc/x86_64-linux-android/4.8/include/xsaveoptintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/xsaveoptintrin.h
deleted file mode 100644
index 0d73e34..0000000
--- a/lib/gcc/x86_64-linux-android/4.8/include/xsaveoptintrin.h
+++ /dev/null
@@ -1,47 +0,0 @@
-/* Copyright (C) 2012-2013 Free Software Foundation, Inc.
-
-   This file is part of GCC.
-
-   GCC is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 3, or (at your option)
-   any later version.
-
-   GCC is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   Under Section 7 of GPL version 3, you are granted additional
-   permissions described in the GCC Runtime Library Exception, version
-   3.1, as published by the Free Software Foundation.
-
-   You should have received a copy of the GNU General Public License and
-   a copy of the GCC Runtime Library Exception along with this program;
-   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
-   <http://www.gnu.org/licenses/>.  */
-
-/* #if !defined _X86INTRIN_H_INCLUDED && !defined _IMMINTRIN_H_INCLUDED */
-/* # error "Never use <xsaveoptintrin.h> directly; include <x86intrin.h> instead." */
-/* #endif */
-
-#ifndef _XSAVEOPTINTRIN_H_INCLUDED
-#define _XSAVEOPTINTRIN_H_INCLUDED
-
-extern __inline void
-__attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_xsaveopt (void *__P, long long __M)
-{
-  return __builtin_ia32_xsaveopt (__P, __M);
-}
-
-#ifdef __x86_64__
-extern __inline void
-__attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_xsaveopt64 (void *__P, long long __M)
-{
-  return __builtin_ia32_xsaveopt64 (__P, __M);
-}
-#endif
-
-#endif /* _XSAVEOPTINTRIN_H_INCLUDED */
diff --git a/lib/gcc/x86_64-linux-android/4.8/include/xtestintrin.h b/lib/gcc/x86_64-linux-android/4.8/include/xtestintrin.h
deleted file mode 100644
index c82fb7a..0000000
--- a/lib/gcc/x86_64-linux-android/4.8/include/xtestintrin.h
+++ /dev/null
@@ -1,44 +0,0 @@
-/* Copyright (C) 2012-2013 Free Software Foundation, Inc.
-
-   This file is part of GCC.
-
-   GCC is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 3, or (at your option)
-   any later version.
-
-   GCC is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   Under Section 7 of GPL version 3, you are granted additional
-   permissions described in the GCC Runtime Library Exception, version
-   3.1, as published by the Free Software Foundation.
-
-   You should have received a copy of the GNU General Public License and
-   a copy of the GCC Runtime Library Exception along with this program;
-   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#ifndef _IMMINTRIN_H_INCLUDED
-# error "Never use <xtestintrin.h> directly; include <immintrin.h> instead."
-#endif
-
-#ifndef __RTM__
-# error "RTM instruction set not enabled"
-#endif /* __RTM__ */
-
-#ifndef _XTESTINTRIN_H_INCLUDED
-#define _XTESTINTRIN_H_INCLUDED
-
-/* Return non-zero if the instruction executes inside an RTM or HLE code
-   region.  Return zero otherwise.  */
-extern __inline int
-__attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_xtest (void)
-{
-  return __builtin_ia32_xtest ();
-}
-
-#endif /* _XTESTINTRIN_H_INCLUDED */
diff --git a/lib/gcc/x86_64-linux-android/4.8/libgcc.a b/lib/gcc/x86_64-linux-android/4.8/libgcc.a
Binary files differ
deleted file mode 100644
index 9decb55..0000000
--- a/lib/gcc/x86_64-linux-android/4.8/libgcc.a
+++ /dev/null
diff --git a/lib/gcc/x86_64-linux-android/4.8/libgcov.a b/lib/gcc/x86_64-linux-android/4.8/libgcov.a
Binary files differ
deleted file mode 100644
index ef31d49..0000000
--- a/lib/gcc/x86_64-linux-android/4.8/libgcov.a
+++ /dev/null
diff --git a/lib/gcc/x86_64-linux-android/4.8/x32/crtbegin.o b/lib/gcc/x86_64-linux-android/4.8/x32/crtbegin.o
Binary files differ
deleted file mode 100644
index 5eaa128..0000000
--- a/lib/gcc/x86_64-linux-android/4.8/x32/crtbegin.o
+++ /dev/null
diff --git a/lib/gcc/x86_64-linux-android/4.8/x32/crtbeginS.o b/lib/gcc/x86_64-linux-android/4.8/x32/crtbeginS.o
Binary files differ
deleted file mode 100644
index 91a4ca7..0000000
--- a/lib/gcc/x86_64-linux-android/4.8/x32/crtbeginS.o
+++ /dev/null
diff --git a/lib/gcc/x86_64-linux-android/4.8/x32/crtbeginT.o b/lib/gcc/x86_64-linux-android/4.8/x32/crtbeginT.o
Binary files differ
deleted file mode 100644
index 5eaa128..0000000
--- a/lib/gcc/x86_64-linux-android/4.8/x32/crtbeginT.o
+++ /dev/null
diff --git a/lib/gcc/x86_64-linux-android/4.8/x32/crtend.o b/lib/gcc/x86_64-linux-android/4.8/x32/crtend.o
Binary files differ
deleted file mode 100644
index b2e1d26..0000000
--- a/lib/gcc/x86_64-linux-android/4.8/x32/crtend.o
+++ /dev/null
diff --git a/lib/gcc/x86_64-linux-android/4.8/x32/crtendS.o b/lib/gcc/x86_64-linux-android/4.8/x32/crtendS.o
Binary files differ
deleted file mode 100644
index b2e1d26..0000000
--- a/lib/gcc/x86_64-linux-android/4.8/x32/crtendS.o
+++ /dev/null
diff --git a/lib/gcc/x86_64-linux-android/4.8/x32/crtfastmath.o b/lib/gcc/x86_64-linux-android/4.8/x32/crtfastmath.o
Binary files differ
deleted file mode 100644
index a4dfe40..0000000
--- a/lib/gcc/x86_64-linux-android/4.8/x32/crtfastmath.o
+++ /dev/null
diff --git a/lib/gcc/x86_64-linux-android/4.8/x32/crtprec32.o b/lib/gcc/x86_64-linux-android/4.8/x32/crtprec32.o
Binary files differ
deleted file mode 100644
index 16a61a1..0000000
--- a/lib/gcc/x86_64-linux-android/4.8/x32/crtprec32.o
+++ /dev/null
diff --git a/lib/gcc/x86_64-linux-android/4.8/x32/crtprec64.o b/lib/gcc/x86_64-linux-android/4.8/x32/crtprec64.o
Binary files differ
deleted file mode 100644
index dcd675c..0000000
--- a/lib/gcc/x86_64-linux-android/4.8/x32/crtprec64.o
+++ /dev/null
diff --git a/lib/gcc/x86_64-linux-android/4.8/x32/crtprec80.o b/lib/gcc/x86_64-linux-android/4.8/x32/crtprec80.o
Binary files differ
deleted file mode 100644
index ec8a919..0000000
--- a/lib/gcc/x86_64-linux-android/4.8/x32/crtprec80.o
+++ /dev/null
diff --git a/lib/gcc/x86_64-linux-android/4.8/x32/libgcc.a b/lib/gcc/x86_64-linux-android/4.8/x32/libgcc.a
Binary files differ
deleted file mode 100644
index 28fda5f..0000000
--- a/lib/gcc/x86_64-linux-android/4.8/x32/libgcc.a
+++ /dev/null
diff --git a/lib/gcc/x86_64-linux-android/4.8/x32/libgcov.a b/lib/gcc/x86_64-linux-android/4.8/x32/libgcov.a
Binary files differ
deleted file mode 100644
index 1e755ec..0000000
--- a/lib/gcc/x86_64-linux-android/4.8/x32/libgcov.a
+++ /dev/null
diff --git a/lib/x86_64/libiberty.a b/lib/x86_64/libiberty.a
Binary files differ
deleted file mode 100644
index b81ce95..0000000
--- a/lib/x86_64/libiberty.a
+++ /dev/null
diff --git a/libexec/gcc/x86_64-linux-android/4.8/cc1 b/libexec/gcc/x86_64-linux-android/4.8/cc1
Binary files differ
deleted file mode 100755
index 636d105..0000000
--- a/libexec/gcc/x86_64-linux-android/4.8/cc1
+++ /dev/null
diff --git a/libexec/gcc/x86_64-linux-android/4.8/cc1plus b/libexec/gcc/x86_64-linux-android/4.8/cc1plus
Binary files differ
deleted file mode 100755
index 524da93..0000000
--- a/libexec/gcc/x86_64-linux-android/4.8/cc1plus
+++ /dev/null
diff --git a/libexec/gcc/x86_64-linux-android/4.8/collect2 b/libexec/gcc/x86_64-linux-android/4.8/collect2
Binary files differ
deleted file mode 100755
index 1720a2f..0000000
--- a/libexec/gcc/x86_64-linux-android/4.8/collect2
+++ /dev/null
diff --git a/libexec/gcc/x86_64-linux-android/4.8/liblto_plugin.0.so b/libexec/gcc/x86_64-linux-android/4.8/liblto_plugin.0.so
Binary files differ
deleted file mode 100755
index 615eebc..0000000
--- a/libexec/gcc/x86_64-linux-android/4.8/liblto_plugin.0.so
+++ /dev/null
diff --git a/libexec/gcc/x86_64-linux-android/4.8/liblto_plugin.so b/libexec/gcc/x86_64-linux-android/4.8/liblto_plugin.so
deleted file mode 120000
index 3c0440e..0000000
--- a/libexec/gcc/x86_64-linux-android/4.8/liblto_plugin.so
+++ /dev/null
@@ -1 +0,0 @@
-liblto_plugin.0.so
\ No newline at end of file
diff --git a/libexec/gcc/x86_64-linux-android/4.8/lto-wrapper b/libexec/gcc/x86_64-linux-android/4.8/lto-wrapper
Binary files differ
deleted file mode 100755
index 0b65476..0000000
--- a/libexec/gcc/x86_64-linux-android/4.8/lto-wrapper
+++ /dev/null
diff --git a/libexec/gcc/x86_64-linux-android/4.8/lto1 b/libexec/gcc/x86_64-linux-android/4.8/lto1
Binary files differ
deleted file mode 100755
index ab4e503..0000000
--- a/libexec/gcc/x86_64-linux-android/4.8/lto1
+++ /dev/null
diff --git a/libexec/gcc/x86_64-linux-android/4.8/plugin/gengtype b/libexec/gcc/x86_64-linux-android/4.8/plugin/gengtype
Binary files differ
deleted file mode 100755
index dfc59a7..0000000
--- a/libexec/gcc/x86_64-linux-android/4.8/plugin/gengtype
+++ /dev/null
diff --git a/share/gdb/python/gdb/__init__.py b/share/gdb/python/gdb/__init__.py
deleted file mode 100644
index 6311583..0000000
--- a/share/gdb/python/gdb/__init__.py
+++ /dev/null
@@ -1,124 +0,0 @@
-# Copyright (C) 2010-2013 Free Software Foundation, Inc.
-
-# This program is free software; you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation; either version 3 of the License, or
-# (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program.  If not, see <http://www.gnu.org/licenses/>.
-
-import traceback
-import os
-import sys
-import _gdb
-
-if sys.version_info[0] > 2:
-    # Python 3 moved "reload"
-    from imp import reload
-
-from _gdb import *
-
-class _GdbFile (object):
-    # These two are needed in Python 3
-    encoding = "UTF-8"
-    errors = "strict"
-
-    def close(self):
-        # Do nothing.
-        return None
-
-    def isatty(self):
-        return False
-
-    def writelines(self, iterable):
-        for line in iterable:
-            self.write(line)
-
-    def flush(self):
-        flush()
-
-class GdbOutputFile (_GdbFile):
-    def write(self, s):
-        write(s, stream=STDOUT)
-
-sys.stdout = GdbOutputFile()
-
-class GdbOutputErrorFile (_GdbFile):
-    def write(self, s):
-        write(s, stream=STDERR)
-
-sys.stderr = GdbOutputErrorFile()
-
-# Default prompt hook does nothing.
-prompt_hook = None
-
-# Ensure that sys.argv is set to something.
-# We do not use PySys_SetArgvEx because it did not appear until 2.6.6.
-sys.argv = ['']
-
-# Initial pretty printers.
-pretty_printers = []
-
-# Initial type printers.
-type_printers = []
-
-# Convenience variable to GDB's python directory
-PYTHONDIR = os.path.dirname(os.path.dirname(__file__))
-
-# Auto-load all functions/commands.
-
-# Packages to auto-load.
-
-packages = [
-    'function',
-    'command'
-]
-
-# pkgutil.iter_modules is not available prior to Python 2.6.  Instead,
-# manually iterate the list, collating the Python files in each module
-# path.  Construct the module name, and import.
-
-def auto_load_packages():
-    for package in packages:
-        location = os.path.join(os.path.dirname(__file__), package)
-        if os.path.exists(location):
-            py_files = filter(lambda x: x.endswith('.py')
-                                        and x != '__init__.py',
-                              os.listdir(location))
-
-            for py_file in py_files:
-                # Construct from foo.py, gdb.module.foo
-                modname = "%s.%s.%s" % ( __name__, package, py_file[:-3] )
-                try:
-                    if modname in sys.modules:
-                        # reload modules with duplicate names
-                        reload(__import__(modname))
-                    else:
-                        __import__(modname)
-                except:
-                    sys.stderr.write (traceback.format_exc() + "\n")
-
-auto_load_packages()
-
-def GdbSetPythonDirectory(dir):
-    """Update sys.path, reload gdb and auto-load packages."""
-    global PYTHONDIR
-
-    try:
-        sys.path.remove(PYTHONDIR)
-    except ValueError:
-        pass
-    sys.path.insert(0, dir)
-
-    PYTHONDIR = dir
-
-    # note that reload overwrites the gdb module without deleting existing
-    # attributes
-    reload(__import__(__name__))
-    auto_load_packages()
diff --git a/share/gdb/python/gdb/command/__init__.py b/share/gdb/python/gdb/command/__init__.py
deleted file mode 100644
index 21eaef8..0000000
--- a/share/gdb/python/gdb/command/__init__.py
+++ /dev/null
@@ -1,16 +0,0 @@
-# Copyright (C) 2010-2013 Free Software Foundation, Inc.
-
-# This program is free software; you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation; either version 3 of the License, or
-# (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program.  If not, see <http://www.gnu.org/licenses/>.
- - diff --git a/share/gdb/python/gdb/command/explore.py b/share/gdb/python/gdb/command/explore.py deleted file mode 100644 index dd77875..0000000 --- a/share/gdb/python/gdb/command/explore.py +++ /dev/null @@ -1,760 +0,0 @@ -# GDB 'explore' command. -# Copyright (C) 2012-2013 Free Software Foundation, Inc. - -# This program is free software; you can redistribute it and/or modify -# it under the terms of the GNU General Public License as published by -# the Free Software Foundation; either version 3 of the License, or -# (at your option) any later version. -# -# This program is distributed in the hope that it will be useful, -# but WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -# GNU General Public License for more details. -# -# You should have received a copy of the GNU General Public License -# along with this program. If not, see <http://www.gnu.org/licenses/>. - -"""Implementation of the GDB 'explore' command using the GDB Python API.""" - -import gdb -import sys - -if sys.version_info[0] > 2: - # Python 3 renamed raw_input to input - raw_input = input - -class Explorer(object): - """Internal class which invokes other explorers.""" - - # This map is filled by the Explorer.init_env() function - type_code_to_explorer_map = { } - - _SCALAR_TYPE_LIST = ( - gdb.TYPE_CODE_CHAR, - gdb.TYPE_CODE_INT, - gdb.TYPE_CODE_BOOL, - gdb.TYPE_CODE_FLT, - gdb.TYPE_CODE_VOID, - gdb.TYPE_CODE_ENUM, - ) - - @staticmethod - def guard_expr(expr): - length = len(expr) - guard = False - - if expr[0] == '(' and expr[length-1] == ')': - pass - else: - i = 0 - while i < length: - c = expr[i] - if (c == '_' or ('a' <= c and c <= 'z') or - ('A' <= c and c <= 'Z') or ('0' <= c and c <= '9')): - pass - else: - guard = True - break - i += 1 - - if guard: - return "(" + expr + ")" - else: - return expr - - @staticmethod - def explore_expr(expr, value, is_child): - """Main function to explore an expression value. 
- - Arguments: - expr: The expression string that is being explored. - value: The gdb.Value value of the expression. - is_child: Boolean value to indicate if the expression is a child. - An expression is a child if it is derived from the main - expression entered by the user. For example, if the user - entered an expression which evaluates to a struct, then - when exploring the fields of the struct, is_child is set - to True internally. - - Returns: - No return value. - """ - type_code = value.type.code - if type_code in Explorer.type_code_to_explorer_map: - explorer_class = Explorer.type_code_to_explorer_map[type_code] - while explorer_class.explore_expr(expr, value, is_child): - pass - else: - print ("Explorer for type '%s' not yet available.\n" % - str(value.type)) - - @staticmethod - def explore_type(name, datatype, is_child): - """Main function to explore a data type. - - Arguments: - name: The string representing the path to the data type being - explored. - datatype: The gdb.Type value of the data type being explored. - is_child: Boolean value to indicate if the name is a child. - A name is a child if it is derived from the main name - entered by the user. For example, if the user entered - the name of struct type, then when exploring the fields - of the struct, is_child is set to True internally. - - Returns: - No return value. - """ - type_code = datatype.code - if type_code in Explorer.type_code_to_explorer_map: - explorer_class = Explorer.type_code_to_explorer_map[type_code] - while explorer_class.explore_type(name, datatype, is_child): - pass - else: - print ("Explorer for type '%s' not yet available.\n" % - str(datatype)) - - @staticmethod - def init_env(): - """Initializes the Explorer environment. - This function should be invoked before starting any exploration. If - invoked before an exploration, it need not be invoked for subsequent - explorations. 
- """ - Explorer.type_code_to_explorer_map = { - gdb.TYPE_CODE_CHAR : ScalarExplorer, - gdb.TYPE_CODE_INT : ScalarExplorer, - gdb.TYPE_CODE_BOOL : ScalarExplorer, - gdb.TYPE_CODE_FLT : ScalarExplorer, - gdb.TYPE_CODE_VOID : ScalarExplorer, - gdb.TYPE_CODE_ENUM : ScalarExplorer, - gdb.TYPE_CODE_STRUCT : CompoundExplorer, - gdb.TYPE_CODE_UNION : CompoundExplorer, - gdb.TYPE_CODE_PTR : PointerExplorer, - gdb.TYPE_CODE_REF : ReferenceExplorer, - gdb.TYPE_CODE_TYPEDEF : TypedefExplorer, - gdb.TYPE_CODE_ARRAY : ArrayExplorer - } - - @staticmethod - def is_scalar_type(type): - """Checks whether a type is a scalar type. - A type is a scalar type of its type is - gdb.TYPE_CODE_CHAR or - gdb.TYPE_CODE_INT or - gdb.TYPE_CODE_BOOL or - gdb.TYPE_CODE_FLT or - gdb.TYPE_CODE_VOID or - gdb.TYPE_CODE_ENUM. - - Arguments: - type: The type to be checked. - - Returns: - 'True' if 'type' is a scalar type. 'False' otherwise. - """ - return type.code in Explorer._SCALAR_TYPE_LIST - - @staticmethod - def return_to_parent_value(): - """A utility function which prints that the current exploration session - is returning to the parent value. Useful when exploring values. - """ - print ("\nReturning to parent value...\n") - - @staticmethod - def return_to_parent_value_prompt(): - """A utility function which prompts the user to press the 'enter' key - so that the exploration session can shift back to the parent value. - Useful when exploring values. - """ - raw_input("\nPress enter to return to parent value: ") - - @staticmethod - def return_to_enclosing_type(): - """A utility function which prints that the current exploration session - is returning to the enclosing type. Useful when exploring types. - """ - print ("\nReturning to enclosing type...\n") - - @staticmethod - def return_to_enclosing_type_prompt(): - """A utility function which prompts the user to press the 'enter' key - so that the exploration session can shift back to the enclosing type. - Useful when exploring types. 
- """ - raw_input("\nPress enter to return to enclosing type: ") - - -class ScalarExplorer(object): - """Internal class used to explore scalar values.""" - - @staticmethod - def explore_expr(expr, value, is_child): - """Function to explore scalar values. - See Explorer.explore_expr and Explorer.is_scalar_type for more - information. - """ - print ("'%s' is a scalar value of type '%s'." % - (expr, value.type)) - print ("%s = %s" % (expr, str(value))) - - if is_child: - Explorer.return_to_parent_value_prompt() - Explorer.return_to_parent_value() - - return False - - @staticmethod - def explore_type(name, datatype, is_child): - """Function to explore scalar types. - See Explorer.explore_type and Explorer.is_scalar_type for more - information. - """ - if datatype.code == gdb.TYPE_CODE_ENUM: - if is_child: - print ("%s is of an enumerated type '%s'." % - (name, str(datatype))) - else: - print ("'%s' is an enumerated type." % name) - else: - if is_child: - print ("%s is of a scalar type '%s'." % - (name, str(datatype))) - else: - print ("'%s' is a scalar type." % name) - - if is_child: - Explorer.return_to_enclosing_type_prompt() - Explorer.return_to_enclosing_type() - - return False - - -class PointerExplorer(object): - """Internal class used to explore pointer values.""" - - @staticmethod - def explore_expr(expr, value, is_child): - """Function to explore pointer values. - See Explorer.explore_expr for more information. - """ - print ("'%s' is a pointer to a value of type '%s'" % - (expr, str(value.type.target()))) - option = raw_input("Continue exploring it as a pointer to a single " - "value [y/n]: ") - if option == "y": - deref_value = None - try: - deref_value = value.dereference() - str(deref_value) - except gdb.MemoryError: - print ("'%s' a pointer pointing to an invalid memory " - "location." 
% expr) - if is_child: - Explorer.return_to_parent_value_prompt() - return False - Explorer.explore_expr("*%s" % Explorer.guard_expr(expr), - deref_value, is_child) - return False - - option = raw_input("Continue exploring it as a pointer to an " - "array [y/n]: ") - if option == "y": - while True: - index = 0 - try: - index = int(raw_input("Enter the index of the element you " - "want to explore in '%s': " % expr)) - except ValueError: - break - element_expr = "%s[%d]" % (Explorer.guard_expr(expr), index) - element = value[index] - try: - str(element) - except gdb.MemoryError: - print ("Cannot read value at index %d." % index) - continue - Explorer.explore_expr(element_expr, element, True) - return False - - if is_child: - Explorer.return_to_parent_value() - return False - - @staticmethod - def explore_type(name, datatype, is_child): - """Function to explore pointer types. - See Explorer.explore_type for more information. - """ - target_type = datatype.target() - print ("\n%s is a pointer to a value of type '%s'." % - (name, str(target_type))) - - Explorer.explore_type("the pointee type of %s" % name, - target_type, - is_child) - return False - - -class ReferenceExplorer(object): - """Internal class used to explore reference (TYPE_CODE_REF) values.""" - - @staticmethod - def explore_expr(expr, value, is_child): - """Function to explore array values. - See Explorer.explore_expr for more information. - """ - referenced_value = value.referenced_value() - Explorer.explore_expr(expr, referenced_value, is_child) - return False - - @staticmethod - def explore_type(name, datatype, is_child): - """Function to explore pointer types. - See Explorer.explore_type for more information. - """ - target_type = datatype.target() - Explorer.explore_type(name, target_type, is_child) - return False - - -class ArrayExplorer(object): - """Internal class used to explore arrays.""" - - @staticmethod - def explore_expr(expr, value, is_child): - """Function to explore array values. 
- See Explorer.explore_expr for more information. - """ - target_type = value.type.target() - print ("'%s' is an array of '%s'." % (expr, str(target_type))) - index = 0 - try: - index = int(raw_input("Enter the index of the element you want to " - "explore in '%s': " % expr)) - except ValueError: - if is_child: - Explorer.return_to_parent_value() - return False - - element = None - try: - element = value[index] - str(element) - except gdb.MemoryError: - print ("Cannot read value at index %d." % index) - raw_input("Press enter to continue... ") - return True - - Explorer.explore_expr("%s[%d]" % (Explorer.guard_expr(expr), index), - element, True) - return True - - @staticmethod - def explore_type(name, datatype, is_child): - """Function to explore array types. - See Explorer.explore_type for more information. - """ - target_type = datatype.target() - print ("%s is an array of '%s'." % (name, str(target_type))) - - Explorer.explore_type("the array element of %s" % name, target_type, - is_child) - return False - - -class CompoundExplorer(object): - """Internal class used to explore struct, classes and unions.""" - - @staticmethod - def _print_fields(print_list): - """Internal function which prints the fields of a struct/class/union. - """ - max_field_name_length = 0 - for pair in print_list: - if max_field_name_length < len(pair[0]): - max_field_name_length = len(pair[0]) - - for pair in print_list: - print (" %*s = %s" % (max_field_name_length, pair[0], pair[1])) - - @staticmethod - def _get_real_field_count(fields): - real_field_count = 0; - for field in fields: - if not field.artificial: - real_field_count = real_field_count + 1 - - return real_field_count - - @staticmethod - def explore_expr(expr, value, is_child): - """Function to explore structs/classes and union values. - See Explorer.explore_expr for more information. 
- """ - datatype = value.type - type_code = datatype.code - fields = datatype.fields() - - if type_code == gdb.TYPE_CODE_STRUCT: - type_desc = "struct/class" - else: - type_desc = "union" - - if CompoundExplorer._get_real_field_count(fields) == 0: - print ("The value of '%s' is a %s of type '%s' with no fields." % - (expr, type_desc, str(value.type))) - if is_child: - Explorer.return_to_parent_value_prompt() - return False - - print ("The value of '%s' is a %s of type '%s' with the following " - "fields:\n" % (expr, type_desc, str(value.type))) - - has_explorable_fields = False - choice_to_compound_field_map = { } - current_choice = 0 - print_list = [ ] - for field in fields: - if field.artificial: - continue - field_full_name = Explorer.guard_expr(expr) + "." + field.name - if field.is_base_class: - field_value = value.cast(field.type) - else: - field_value = value[field.name] - literal_value = "" - if type_code == gdb.TYPE_CODE_UNION: - literal_value = ("<Enter %d to explore this field of type " - "'%s'>" % (current_choice, str(field.type))) - has_explorable_fields = True - else: - if Explorer.is_scalar_type(field.type): - literal_value = ("%s .. 
(Value of type '%s')" % - (str(field_value), str(field.type))) - else: - if field.is_base_class: - field_desc = "base class" - else: - field_desc = "field" - literal_value = ("<Enter %d to explore this %s of type " - "'%s'>" % - (current_choice, field_desc, - str(field.type))) - has_explorable_fields = True - - choice_to_compound_field_map[str(current_choice)] = ( - field_full_name, field_value) - current_choice = current_choice + 1 - - print_list.append((field.name, literal_value)) - - CompoundExplorer._print_fields(print_list) - print ("") - - if has_explorable_fields: - choice = raw_input("Enter the field number of choice: ") - if choice in choice_to_compound_field_map: - Explorer.explore_expr(choice_to_compound_field_map[choice][0], - choice_to_compound_field_map[choice][1], - True) - return True - else: - if is_child: - Explorer.return_to_parent_value() - else: - if is_child: - Explorer.return_to_parent_value_prompt() - - return False - - @staticmethod - def explore_type(name, datatype, is_child): - """Function to explore struct/class and union types. - See Explorer.explore_type for more information. - """ - type_code = datatype.code - type_desc = "" - if type_code == gdb.TYPE_CODE_STRUCT: - type_desc = "struct/class" - else: - type_desc = "union" - - fields = datatype.fields() - if CompoundExplorer._get_real_field_count(fields) == 0: - if is_child: - print ("%s is a %s of type '%s' with no fields." % - (name, type_desc, str(datatype))) - Explorer.return_to_enclosing_type_prompt() - else: - print ("'%s' is a %s with no fields." 
% (name, type_desc)) - return False - - if is_child: - print ("%s is a %s of type '%s' " - "with the following fields:\n" % - (name, type_desc, str(datatype))) - else: - print ("'%s' is a %s with the following " - "fields:\n" % - (name, type_desc)) - - has_explorable_fields = False - current_choice = 0 - choice_to_compound_field_map = { } - print_list = [ ] - for field in fields: - if field.artificial: - continue - if field.is_base_class: - field_desc = "base class" - else: - field_desc = "field" - rhs = ("<Enter %d to explore this %s of type '%s'>" % - (current_choice, field_desc, str(field.type))) - print_list.append((field.name, rhs)) - choice_to_compound_field_map[str(current_choice)] = ( - field.name, field.type, field_desc) - current_choice = current_choice + 1 - - CompoundExplorer._print_fields(print_list) - print ("") - - if len(choice_to_compound_field_map) > 0: - choice = raw_input("Enter the field number of choice: ") - if choice in choice_to_compound_field_map: - if is_child: - new_name = ("%s '%s' of %s" % - (choice_to_compound_field_map[choice][2], - choice_to_compound_field_map[choice][0], - name)) - else: - new_name = ("%s '%s' of '%s'" % - (choice_to_compound_field_map[choice][2], - choice_to_compound_field_map[choice][0], - name)) - Explorer.explore_type(new_name, - choice_to_compound_field_map[choice][1], True) - return True - else: - if is_child: - Explorer.return_to_enclosing_type() - else: - if is_child: - Explorer.return_to_enclosing_type_prompt() - - return False - - -class TypedefExplorer(object): - """Internal class used to explore values whose type is a typedef.""" - - @staticmethod - def explore_expr(expr, value, is_child): - """Function to explore typedef values. - See Explorer.explore_expr for more information. 
- """ - actual_type = value.type.strip_typedefs() - print ("The value of '%s' is of type '%s' " - "which is a typedef of type '%s'" % - (expr, str(value.type), str(actual_type))) - - Explorer.explore_expr(expr, value.cast(actual_type), is_child) - return False - - @staticmethod - def explore_type(name, datatype, is_child): - """Function to explore typedef types. - See Explorer.explore_type for more information. - """ - actual_type = datatype.strip_typedefs() - if is_child: - print ("The type of %s is a typedef of type '%s'." % - (name, str(actual_type))) - else: - print ("The type '%s' is a typedef of type '%s'." % - (name, str(actual_type))) - - Explorer.explore_type(name, actual_type, is_child) - return False - - -class ExploreUtils(object): - """Internal class which provides utilities for the main command classes.""" - - @staticmethod - def check_args(name, arg_str): - """Utility to check if adequate number of arguments are passed to an - explore command. - - Arguments: - name: The name of the explore command. - arg_str: The argument string passed to the explore command. - - Returns: - True if adequate arguments are passed, false otherwise. - - Raises: - gdb.GdbError if adequate arguments are not passed. - """ - if len(arg_str) < 1: - raise gdb.GdbError("ERROR: '%s' requires an argument." - % name) - return False - else: - return True - - @staticmethod - def get_type_from_str(type_str): - """A utility function to deduce the gdb.Type value from a string - representing the type. - - Arguments: - type_str: The type string from which the gdb.Type value should be - deduced. - - Returns: - The deduced gdb.Type value if possible, None otherwise. - """ - try: - # Assume the current language to be C/C++ and make a try. - return gdb.parse_and_eval("(%s *)0" % type_str).type.target() - except RuntimeError: - # If assumption of current language to be C/C++ was wrong, then - # lookup the type using the API. 
- try: - return gdb.lookup_type(type_str) - except RuntimeError: - return None - - @staticmethod - def get_value_from_str(value_str): - """A utility function to deduce the gdb.Value value from a string - representing the value. - - Arguments: - value_str: The value string from which the gdb.Value value should - be deduced. - - Returns: - The deduced gdb.Value value if possible, None otherwise. - """ - try: - return gdb.parse_and_eval(value_str) - except RuntimeError: - return None - - -class ExploreCommand(gdb.Command): - """Explore a value or a type valid in the current context. - - Usage: - - explore ARG - - - ARG is either a valid expression or a type name. - - At any stage of exploration, hit the return key (instead of a - choice, if any) to return to the enclosing type or value. - """ - - def __init__(self): - super(ExploreCommand, self).__init__(name = "explore", - command_class = gdb.COMMAND_DATA, - prefix = True) - - def invoke(self, arg_str, from_tty): - if ExploreUtils.check_args("explore", arg_str) == False: - return - - # Check if it is a value - value = ExploreUtils.get_value_from_str(arg_str) - if value is not None: - Explorer.explore_expr(arg_str, value, False) - return - - # If it is not a value, check if it is a type - datatype = ExploreUtils.get_type_from_str(arg_str) - if datatype is not None: - Explorer.explore_type(arg_str, datatype, False) - return - - # If it is neither a value nor a type, raise an error. - raise gdb.GdbError( - ("'%s' neither evaluates to a value nor is a type " - "in the current context." % - arg_str)) - - -class ExploreValueCommand(gdb.Command): - """Explore value of an expression valid in the current context. - - Usage: - - explore value ARG - - - ARG is a valid expression. - - At any stage of exploration, hit the return key (instead of a - choice, if any) to return to the enclosing value. 
- """ - - def __init__(self): - super(ExploreValueCommand, self).__init__( - name = "explore value", command_class = gdb.COMMAND_DATA) - - def invoke(self, arg_str, from_tty): - if ExploreUtils.check_args("explore value", arg_str) == False: - return - - value = ExploreUtils.get_value_from_str(arg_str) - if value is None: - raise gdb.GdbError( - (" '%s' does not evaluate to a value in the current " - "context." % - arg_str)) - return - - Explorer.explore_expr(arg_str, value, False) - - -class ExploreTypeCommand(gdb.Command): - """Explore a type or the type of an expression valid in the current - context. - - Usage: - - explore type ARG - - - ARG is a valid expression or a type name. - - At any stage of exploration, hit the return key (instead of a - choice, if any) to return to the enclosing type. - """ - - def __init__(self): - super(ExploreTypeCommand, self).__init__( - name = "explore type", command_class = gdb.COMMAND_DATA) - - def invoke(self, arg_str, from_tty): - if ExploreUtils.check_args("explore type", arg_str) == False: - return - - datatype = ExploreUtils.get_type_from_str(arg_str) - if datatype is not None: - Explorer.explore_type(arg_str, datatype, False) - return - - value = ExploreUtils.get_value_from_str(arg_str) - if value is not None: - print ("'%s' is of type '%s'." % (arg_str, str(value.type))) - Explorer.explore_type(str(value.type), value.type, False) - return - - raise gdb.GdbError(("'%s' is not a type or value in the current " - "context." % arg_str)) - - -Explorer.init_env() - -ExploreCommand() -ExploreValueCommand() -ExploreTypeCommand() diff --git a/share/gdb/python/gdb/command/pretty_printers.py b/share/gdb/python/gdb/command/pretty_printers.py deleted file mode 100644 index 7b03e3a..0000000 --- a/share/gdb/python/gdb/command/pretty_printers.py +++ /dev/null @@ -1,368 +0,0 @@ -# Pretty-printer commands. -# Copyright (C) 2010-2013 Free Software Foundation, Inc. 
- -# This program is free software; you can redistribute it and/or modify -# it under the terms of the GNU General Public License as published by -# the Free Software Foundation; either version 3 of the License, or -# (at your option) any later version. -# -# This program is distributed in the hope that it will be useful, -# but WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -# GNU General Public License for more details. -# -# You should have received a copy of the GNU General Public License -# along with this program. If not, see <http://www.gnu.org/licenses/>. - -"""GDB commands for working with pretty-printers.""" - -import copy -import gdb -import re - - -def parse_printer_regexps(arg): - """Internal utility to parse a pretty-printer command argv. - - Arguments: - arg: The arguments to the command. The format is: - [object-regexp [name-regexp]]. - Individual printers in a collection are named as - printer-name;subprinter-name. - - Returns: - The result is a 3-tuple of compiled regular expressions, except that - the resulting compiled subprinter regexp is None if not provided. - - Raises: - SyntaxError: an error processing ARG - """ - - argv = gdb.string_to_argv(arg); - argc = len(argv) - object_regexp = "" # match everything - name_regexp = "" # match everything - subname_regexp = None - if argc > 3: - raise SyntaxError("too many arguments") - if argc >= 1: - object_regexp = argv[0] - if argc >= 2: - name_subname = argv[1].split(";", 1) - name_regexp = name_subname[0] - if len(name_subname) == 2: - subname_regexp = name_subname[1] - # That re.compile raises SyntaxError was determined empirically. - # We catch it and reraise it to provide a slightly more useful - # error message for the user. 
- try: - object_re = re.compile(object_regexp) - except SyntaxError: - raise SyntaxError("invalid object regexp: %s" % object_regexp) - try: - name_re = re.compile (name_regexp) - except SyntaxError: - raise SyntaxError("invalid name regexp: %s" % name_regexp) - if subname_regexp is not None: - try: - subname_re = re.compile(subname_regexp) - except SyntaxError: - raise SyntaxError("invalid subname regexp: %s" % subname_regexp) - else: - subname_re = None - return(object_re, name_re, subname_re) - - -def printer_enabled_p(printer): - """Internal utility to see if printer (or subprinter) is enabled.""" - if hasattr(printer, "enabled"): - return printer.enabled - else: - return True - - -class InfoPrettyPrinter(gdb.Command): - """GDB command to list all registered pretty-printers. - - Usage: info pretty-printer [object-regexp [name-regexp]] - - OBJECT-REGEXP is a regular expression matching the objects to list. - Objects are "global", the program space's file, and the objfiles within - that program space. - - NAME-REGEXP matches the name of the pretty-printer. - Individual printers in a collection are named as - printer-name;subprinter-name. - """ - - def __init__ (self): - super(InfoPrettyPrinter, self).__init__("info pretty-printer", - gdb.COMMAND_DATA) - - @staticmethod - def enabled_string(printer): - """Return "" if PRINTER is enabled, otherwise " [disabled]".""" - if printer_enabled_p(printer): - return "" - else: - return " [disabled]" - - @staticmethod - def printer_name(printer): - """Return the printer's name.""" - if hasattr(printer, "name"): - return printer.name - if hasattr(printer, "__name__"): - return printer.__name__ - # This "shouldn't happen", but the public API allows for - # direct additions to the pretty-printer list, and we shouldn't - # crash because someone added a bogus printer. - # Plus we want to give the user a way to list unknown printers. 
- return "unknown" - - def list_pretty_printers(self, pretty_printers, name_re, subname_re): - """Print a list of pretty-printers.""" - # A potential enhancement is to provide an option to list printers in - # "lookup order" (i.e. unsorted). - sorted_pretty_printers = sorted (copy.copy(pretty_printers), - key = self.printer_name) - for printer in sorted_pretty_printers: - name = self.printer_name(printer) - enabled = self.enabled_string(printer) - if name_re.match(name): - print (" %s%s" % (name, enabled)) - if (hasattr(printer, "subprinters") and - printer.subprinters is not None): - sorted_subprinters = sorted (copy.copy(printer.subprinters), - key = self.printer_name) - for subprinter in sorted_subprinters: - if (not subname_re or - subname_re.match(subprinter.name)): - print (" %s%s" % - (subprinter.name, - self.enabled_string(subprinter))) - - def invoke1(self, title, printer_list, - obj_name_to_match, object_re, name_re, subname_re): - """Subroutine of invoke to simplify it.""" - if printer_list and object_re.match(obj_name_to_match): - print (title) - self.list_pretty_printers(printer_list, name_re, subname_re) - - def invoke(self, arg, from_tty): - """GDB calls this to perform the command.""" - (object_re, name_re, subname_re) = parse_printer_regexps(arg) - self.invoke1("global pretty-printers:", gdb.pretty_printers, - "global", object_re, name_re, subname_re) - cp = gdb.current_progspace() - self.invoke1("progspace %s pretty-printers:" % cp.filename, - cp.pretty_printers, "progspace", - object_re, name_re, subname_re) - for objfile in gdb.objfiles(): - self.invoke1(" objfile %s pretty-printers:" % objfile.filename, - objfile.pretty_printers, objfile.filename, - object_re, name_re, subname_re) - - -def count_enabled_printers(pretty_printers): - """Return a 2-tuple of number of enabled and total printers.""" - enabled = 0 - total = 0 - for printer in pretty_printers: - if (hasattr(printer, "subprinters") - and printer.subprinters is not None): - if 
printer_enabled_p(printer): - for subprinter in printer.subprinters: - if printer_enabled_p(subprinter): - enabled += 1 - total += len(printer.subprinters) - else: - if printer_enabled_p(printer): - enabled += 1 - total += 1 - return (enabled, total) - - -def count_all_enabled_printers(): - """Return a 2-tuple of the number of enabled printers and the total - number of printers. This includes subprinters. - """ - enabled_count = 0 - total_count = 0 - (t_enabled, t_total) = count_enabled_printers(gdb.pretty_printers) - enabled_count += t_enabled - total_count += t_total - (t_enabled, t_total) = count_enabled_printers(gdb.current_progspace().pretty_printers) - enabled_count += t_enabled - total_count += t_total - for objfile in gdb.objfiles(): - (t_enabled, t_total) = count_enabled_printers(objfile.pretty_printers) - enabled_count += t_enabled - total_count += t_total - return (enabled_count, total_count) - - -def pluralize(text, n, suffix="s"): - """Return TEXT pluralized if N != 1.""" - if n != 1: - return "%s%s" % (text, suffix) - else: - return text - - -def show_pretty_printer_enabled_summary(): - """Print the number of printers enabled/disabled. - We count subprinters individually. - """ - (enabled_count, total_count) = count_all_enabled_printers() - print ("%d of %d printers enabled" % (enabled_count, total_count)) - - -def do_enable_pretty_printer_1 (pretty_printers, name_re, subname_re, flag): - """Worker for enabling/disabling pretty-printers. - - Arguments: - pretty_printers: list of pretty-printers - name_re: regular-expression object to select printers - subname_re: regular expression object to select subprinters or None - if all are affected - flag: True for Enable, False for Disable - - Returns: - The number of printers affected. - This is just for informational purposes for the user.
- """ - total = 0 - for printer in pretty_printers: - if (hasattr(printer, "name") and name_re.match(printer.name) or - hasattr(printer, "__name__") and name_re.match(printer.__name__)): - if (hasattr(printer, "subprinters") and - printer.subprinters is not None): - if not subname_re: - # Only record printers that change state. - if printer_enabled_p(printer) != flag: - for subprinter in printer.subprinters: - if printer_enabled_p(subprinter): - total += 1 - # NOTE: We preserve individual subprinter settings. - printer.enabled = flag - else: - # NOTE: Whether this actually disables the subprinter - # depends on whether the printer's lookup function supports - # the "enable" API. We can only assume it does. - for subprinter in printer.subprinters: - if subname_re.match(subprinter.name): - # Only record printers that change state. - if (printer_enabled_p(printer) and - printer_enabled_p(subprinter) != flag): - total += 1 - subprinter.enabled = flag - else: - # This printer has no subprinters. - # If the user does "disable pretty-printer .* .* foo" - # should we disable printers that don't have subprinters? - # How do we apply "foo" in this context? Since there is no - # "foo" subprinter it feels like we should skip this printer. - # There's still the issue of how to handle - # "disable pretty-printer .* .* .*", and every other variation - # that can match everything. For now punt and only support - # "disable pretty-printer .* .*" (i.e. subname is elided) - # to disable everything. - if not subname_re: - # Only record printers that change state. 
- if printer_enabled_p(printer) != flag: - total += 1 - printer.enabled = flag - return total - - -def do_enable_pretty_printer (arg, flag): - """Internal worker for enabling/disabling pretty-printers.""" - (object_re, name_re, subname_re) = parse_printer_regexps(arg) - - total = 0 - if object_re.match("global"): - total += do_enable_pretty_printer_1(gdb.pretty_printers, - name_re, subname_re, flag) - cp = gdb.current_progspace() - if object_re.match("progspace"): - total += do_enable_pretty_printer_1(cp.pretty_printers, - name_re, subname_re, flag) - for objfile in gdb.objfiles(): - if object_re.match(objfile.filename): - total += do_enable_pretty_printer_1(objfile.pretty_printers, - name_re, subname_re, flag) - - if flag: - state = "enabled" - else: - state = "disabled" - print ("%d %s %s" % (total, pluralize("printer", total), state)) - - # Print the total list of printers currently enabled/disabled. - # This is to further assist the user in determining whether the result - # is expected. Since we use regexps to select it's useful. - show_pretty_printer_enabled_summary() - - -# Enable/Disable one or more pretty-printers. -# -# This is intended for use when a broken pretty-printer is shipped/installed -# and the user wants to disable that printer without disabling all the other -# printers. -# -# A useful addition would be -v (verbose) to show each printer affected. - -class EnablePrettyPrinter (gdb.Command): - """GDB command to enable the specified pretty-printer. - - Usage: enable pretty-printer [object-regexp [name-regexp]] - - OBJECT-REGEXP is a regular expression matching the objects to examine. - Objects are "global", the program space's file, and the objfiles within - that program space. - - NAME-REGEXP matches the name of the pretty-printer. - Individual printers in a collection are named as - printer-name;subprinter-name. 
- """ - - def __init__(self): - super(EnablePrettyPrinter, self).__init__("enable pretty-printer", - gdb.COMMAND_DATA) - - def invoke(self, arg, from_tty): - """GDB calls this to perform the command.""" - do_enable_pretty_printer(arg, True) - - -class DisablePrettyPrinter (gdb.Command): - """GDB command to disable the specified pretty-printer. - - Usage: disable pretty-printer [object-regexp [name-regexp]] - - OBJECT-REGEXP is a regular expression matching the objects to examine. - Objects are "global", the program space's file, and the objfiles within - that program space. - - NAME-REGEXP matches the name of the pretty-printer. - Individual printers in a collection are named as - printer-name;subprinter-name. - """ - - def __init__(self): - super(DisablePrettyPrinter, self).__init__("disable pretty-printer", - gdb.COMMAND_DATA) - - def invoke(self, arg, from_tty): - """GDB calls this to perform the command.""" - do_enable_pretty_printer(arg, False) - - -def register_pretty_printer_commands(): - """Call from a top level script to install the pretty-printer commands.""" - InfoPrettyPrinter() - EnablePrettyPrinter() - DisablePrettyPrinter() - -register_pretty_printer_commands() diff --git a/share/gdb/python/gdb/command/prompt.py b/share/gdb/python/gdb/command/prompt.py deleted file mode 100644 index 394e40c..0000000 --- a/share/gdb/python/gdb/command/prompt.py +++ /dev/null @@ -1,66 +0,0 @@ -# Extended prompt. -# Copyright (C) 2011-2013 Free Software Foundation, Inc. - -# This program is free software; you can redistribute it and/or modify -# it under the terms of the GNU General Public License as published by -# the Free Software Foundation; either version 3 of the License, or -# (at your option) any later version. -# -# This program is distributed in the hope that it will be useful, -# but WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -# GNU General Public License for more details. 
-# -# You should have received a copy of the GNU General Public License -# along with this program. If not, see <http://www.gnu.org/licenses/>. - -"""GDB command for working with extended prompts.""" - -import gdb -import gdb.prompt - -class _ExtendedPrompt(gdb.Parameter): - - """Set the extended prompt. - -Usage: set extended-prompt VALUE - -Substitutions are applied to VALUE to compute the real prompt. - -The currently defined substitutions are: - -""" - # Add the prompt library's dynamically generated help to the - # __doc__ string. - __doc__ = __doc__ + gdb.prompt.prompt_help() - - set_doc = "Set the extended prompt." - show_doc = "Show the extended prompt." - - def __init__(self): - super(_ExtendedPrompt, self).__init__("extended-prompt", - gdb.COMMAND_SUPPORT, - gdb.PARAM_STRING_NOESCAPE) - self.value = '' - self.hook_set = False - - def get_show_string (self, pvalue): - if self.value != '': - return "The extended prompt is: " + self.value - else: - return "The extended prompt is not set." - - def get_set_string (self): - if not self.hook_set: - gdb.prompt_hook = self.before_prompt_hook - self.hook_set = True - return "" - - def before_prompt_hook(self, current): - if self.value != '': - newprompt = gdb.prompt.substitute_prompt(self.value) - return newprompt.replace('\\', '\\\\') - else: - return None - -_ExtendedPrompt() diff --git a/share/gdb/python/gdb/command/type_printers.py b/share/gdb/python/gdb/command/type_printers.py deleted file mode 100644 index 81f2ea1..0000000 --- a/share/gdb/python/gdb/command/type_printers.py +++ /dev/null @@ -1,125 +0,0 @@ -# Type printer commands. -# Copyright (C) 2010-2013 Free Software Foundation, Inc. - -# This program is free software; you can redistribute it and/or modify -# it under the terms of the GNU General Public License as published by -# the Free Software Foundation; either version 3 of the License, or -# (at your option) any later version.
-# -# This program is distributed in the hope that it will be useful, -# but WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -# GNU General Public License for more details. -# -# You should have received a copy of the GNU General Public License -# along with this program. If not, see <http://www.gnu.org/licenses/>. - -import copy -import gdb - -"""GDB commands for working with type-printers.""" - -class InfoTypePrinter(gdb.Command): - """GDB command to list all registered type-printers. - - Usage: info type-printers - """ - - def __init__ (self): - super(InfoTypePrinter, self).__init__("info type-printers", - gdb.COMMAND_DATA) - - def list_type_printers(self, type_printers): - """Print a list of type printers.""" - # A potential enhancement is to provide an option to list printers in - # "lookup order" (i.e. unsorted). - sorted_type_printers = sorted (copy.copy(type_printers), - key = lambda x: x.name) - for printer in sorted_type_printers: - if printer.enabled: - enabled = '' - else: - enabled = " [disabled]" - print (" %s%s" % (printer.name, enabled)) - - def invoke(self, arg, from_tty): - """GDB calls this to perform the command.""" - sep = '' - for objfile in gdb.objfiles(): - if objfile.type_printers: - print ("%sType printers for %s:" % (sep, objfile.name)) - self.list_type_printers(objfile.type_printers) - sep = '\n' - if gdb.current_progspace().type_printers: - print ("%sType printers for program space:" % sep) - self.list_type_printers(gdb.current_progspace().type_printers) - sep = '\n' - if gdb.type_printers: - print ("%sGlobal type printers:" % sep) - self.list_type_printers(gdb.type_printers) - -class _EnableOrDisableCommand(gdb.Command): - def __init__(self, setting, name): - super(_EnableOrDisableCommand, self).__init__(name, gdb.COMMAND_DATA) - self.setting = setting - - def set_some(self, name, printers): - result = False - for p in printers: - if name == p.name: - p.enabled 
= self.setting - result = True - return result - - def invoke(self, arg, from_tty): - """GDB calls this to perform the command.""" - for name in arg.split(): - ok = False - for objfile in gdb.objfiles(): - if self.set_some(name, objfile.type_printers): - ok = True - if self.set_some(name, gdb.current_progspace().type_printers): - ok = True - if self.set_some(name, gdb.type_printers): - ok = True - if not ok: - print ("No type printer named '%s'" % name) - - def add_some(self, result, word, printers): - for p in printers: - if p.name.startswith(word): - result.append(p.name) - - def complete(self, text, word): - result = [] - for objfile in gdb.objfiles(): - self.add_some(result, word, objfile.type_printers) - self.add_some(result, word, gdb.current_progspace().type_printers) - self.add_some(result, word, gdb.type_printers) - return result - -class EnableTypePrinter(_EnableOrDisableCommand): - """GDB command to enable the specified type printer. - - Usage: enable type-printer NAME - - NAME is the name of the type-printer. - """ - - def __init__(self): - super(EnableTypePrinter, self).__init__(True, "enable type-printer") - -class DisableTypePrinter(_EnableOrDisableCommand): - """GDB command to disable the specified type-printer. - - Usage: disable type-printer NAME - - NAME is the name of the type-printer. - """ - - def __init__(self): - super(DisableTypePrinter, self).__init__(False, "disable type-printer") - -InfoTypePrinter() -EnableTypePrinter() -DisableTypePrinter() diff --git a/share/gdb/python/gdb/function/__init__.py b/share/gdb/python/gdb/function/__init__.py deleted file mode 100644 index 755bff9..0000000 --- a/share/gdb/python/gdb/function/__init__.py +++ /dev/null @@ -1,14 +0,0 @@ -# Copyright (C) 2012-2013 Free Software Foundation, Inc. 
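
The `complete` method of `_EnableOrDisableCommand` above simply concatenates prefix matches collected by `add_some` across every type-printer list (objfile, progspace, global). A minimal standalone sketch of that prefix-completion logic, with hypothetical printer lists standing in for the real collections:

```python
# Prefix completion over several name collections, mirroring
# _EnableOrDisableCommand.add_some/complete. The printer lists below
# are hypothetical stand-ins for objfile/progspace/global type_printers.
class Printer:
    def __init__(self, name):
        self.name = name

def add_some(result, word, printers):
    # Collect every printer name that starts with the typed word.
    for p in printers:
        if p.name.startswith(word):
            result.append(p.name)

def complete(word, *collections):
    result = []
    for printers in collections:
        add_some(result, word, printers)
    return result

objfile_printers = [Printer("std::string"), Printer("std::vector")]
global_printers = [Printer("stdio_file")]
print(complete("std", objfile_printers, global_printers))
```

Note that matches are returned in collection order without deduplication, just as in the original.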
- -# This program is free software; you can redistribute it and/or modify -# it under the terms of the GNU General Public License as published by -# the Free Software Foundation; either version 3 of the License, or -# (at your option) any later version. -# -# This program is distributed in the hope that it will be useful, -# but WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -# GNU General Public License for more details. -# -# You should have received a copy of the GNU General Public License -# along with this program. If not, see <http://www.gnu.org/licenses/>. diff --git a/share/gdb/python/gdb/function/strfns.py b/share/gdb/python/gdb/function/strfns.py deleted file mode 100644 index efdf950..0000000 --- a/share/gdb/python/gdb/function/strfns.py +++ /dev/null @@ -1,108 +0,0 @@ -# Useful gdb string convenience functions. -# Copyright (C) 2012-2013 Free Software Foundation, Inc. - -# This program is free software; you can redistribute it and/or modify -# it under the terms of the GNU General Public License as published by -# the Free Software Foundation; either version 3 of the License, or -# (at your option) any later version. -# -# This program is distributed in the hope that it will be useful, -# but WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -# GNU General Public License for more details. -# -# You should have received a copy of the GNU General Public License -# along with this program. If not, see <http://www.gnu.org/licenses/>. - -"""$_memeq, $_strlen, $_streq, $_regex""" - -import gdb -import re - - -class _MemEq(gdb.Function): - """$_memeq - compare bytes of memory - -Usage: - $_memeq(a, b, len) - -Returns: - True if len bytes at a and b compare equally. 
-""" - def __init__(self): - super(_MemEq, self).__init__("_memeq") - - def invoke(self, a, b, length): - if length < 0: - raise ValueError("length must be non-negative") - if length == 0: - return True - # The argument(s) to vector are [low_bound,]high_bound. - byte_vector = gdb.lookup_type("char").vector(length - 1) - ptr_byte_vector = byte_vector.pointer() - a_ptr = a.reinterpret_cast(ptr_byte_vector) - b_ptr = b.reinterpret_cast(ptr_byte_vector) - return a_ptr.dereference() == b_ptr.dereference() - - -class _StrLen(gdb.Function): - """$_strlen - compute string length - -Usage: - $_strlen(a) - -Returns: - Length of string a, assumed to be a string in the current language. -""" - def __init__(self): - super(_StrLen, self).__init__("_strlen") - - def invoke(self, a): - s = a.string() - return len(s) - - -class _StrEq(gdb.Function): - """$_streq - check string equality - -Usage: - $_streq(a, b) - -Returns: - True if a and b are identical strings in the current language. - -Example (amd64-linux): - catch syscall open - cond $bpnum $_streq((char*) $rdi, "foo") -""" - def __init__(self): - super(_StrEq, self).__init__("_streq") - - def invoke(self, a, b): - return a.string() == b.string() - - -class _RegEx(gdb.Function): - """$_regex - check if a string matches a regular expression - -Usage: - $_regex(string, regex) - -Returns: - True if string str (in the current language) matches the - regular expression regex. -""" - def __init__(self): - super(_RegEx, self).__init__("_regex") - - def invoke(self, string, regex): - s = string.string() - r = re.compile(regex.string()) - return bool(r.match(s)) - - -# GDB will import us automagically via gdb/__init__.py. -_MemEq() -_StrLen() -_StrEq() -_RegEx() diff --git a/share/gdb/python/gdb/printing.py b/share/gdb/python/gdb/printing.py deleted file mode 100644 index 785a407..0000000 --- a/share/gdb/python/gdb/printing.py +++ /dev/null @@ -1,263 +0,0 @@ -# Pretty-printer utilities. 
-# Copyright (C) 2010-2013 Free Software Foundation, Inc. - -# This program is free software; you can redistribute it and/or modify -# it under the terms of the GNU General Public License as published by -# the Free Software Foundation; either version 3 of the License, or -# (at your option) any later version. -# -# This program is distributed in the hope that it will be useful, -# but WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -# GNU General Public License for more details. -# -# You should have received a copy of the GNU General Public License -# along with this program. If not, see <http://www.gnu.org/licenses/>. - -"""Utilities for working with pretty-printers.""" - -import gdb -import gdb.types -import re -import sys - -if sys.version_info[0] > 2: - # Python 3 removed basestring and long - basestring = str - long = int - -class PrettyPrinter(object): - """A basic pretty-printer. - - Attributes: - name: A unique string among all printers for the context in which - it is defined (objfile, progspace, or global(gdb)), and should - meaningfully describe what can be pretty-printed. - E.g., "StringPiece" or "protobufs". - subprinters: An iterable object with each element having a `name' - attribute, and, potentially, "enabled" attribute. - Or this is None if there are no subprinters. - enabled: A boolean indicating if the printer is enabled. - - Subprinters are for situations where "one" pretty-printer is actually a - collection of several printers. E.g., The libstdc++ pretty-printer has - a pretty-printer for each of several different types, based on regexps. - """ - - # While one might want to push subprinters into the subclass, it's - # present here to formalize such support to simplify - # commands/pretty_printers.py. 
- - def __init__(self, name, subprinters=None): - self.name = name - self.subprinters = subprinters - self.enabled = True - - def __call__(self, val): - # The subclass must define this. - raise NotImplementedError("PrettyPrinter __call__") - - -class SubPrettyPrinter(object): - """Baseclass for sub-pretty-printers. - - Sub-pretty-printers needn't use this, but it formalizes what's needed. - - Attributes: - name: The name of the subprinter. - enabled: A boolean indicating if the subprinter is enabled. - """ - - def __init__(self, name): - self.name = name - self.enabled = True - - -def register_pretty_printer(obj, printer, replace=False): - """Register pretty-printer PRINTER with OBJ. - - The printer is added to the front of the search list, thus one can override - an existing printer if one needs to. Use a different name when overriding - an existing printer, otherwise an exception will be raised; multiple - printers with the same name are disallowed. - - Arguments: - obj: Either an objfile, progspace, or None (in which case the printer - is registered globally). - printer: Either a function of one argument (old way) or any object - which has attributes: name, enabled, __call__. - replace: If True replace any existing copy of the printer. - Otherwise if the printer already exists raise an exception. - - Returns: - Nothing. - - Raises: - TypeError: A problem with the type of the printer. - ValueError: The printer's name contains a semicolon ";". - RuntimeError: A printer with the same name is already registered. - - If the caller wants the printer to be listable and disableable, it must - follow the PrettyPrinter API. This applies to the old way (functions) too. - If printer is an object, __call__ is a method of two arguments: - self, and the value to be pretty-printed. See PrettyPrinter. - """ - - # Watch for both __name__ and name. - # Functions get the former for free, but we don't want to use an - # attribute named __foo__ for pretty-printers-as-objects. 
- # If printer has both, we use `name'. - if not hasattr(printer, "__name__") and not hasattr(printer, "name"): - raise TypeError("printer missing attribute: name") - if hasattr(printer, "name") and not hasattr(printer, "enabled"): - raise TypeError("printer missing attribute: enabled") - if not hasattr(printer, "__call__"): - raise TypeError("printer missing attribute: __call__") - - if obj is None: - if gdb.parameter("verbose"): - gdb.write("Registering global %s pretty-printer ...\n" % - (getattr(printer, "name", None) or - getattr(printer, "__name__", None))) - obj = gdb - else: - if gdb.parameter("verbose"): - gdb.write("Registering %s pretty-printer for %s ...\n" % - (printer.name, obj.filename)) - - if hasattr(printer, "name"): - if not isinstance(printer.name, basestring): - raise TypeError("printer name is not a string") - # If printer provides a name, make sure it doesn't contain ";". - # Semicolon is used by the info/enable/disable pretty-printer commands - # to delimit subprinters. - if printer.name.find(";") >= 0: - raise ValueError("semicolon ';' in printer name") - # Also make sure the name is unique. - # Alas, we can't do the same for functions and __name__, they could - # all have a canonical name like "lookup_function". - # PERF: gdb records printers in a list, making this inefficient. - i = 0 - for p in obj.pretty_printers: - if hasattr(p, "name") and p.name == printer.name: - if replace: - del obj.pretty_printers[i] - break - else: - raise RuntimeError("pretty-printer already registered: %s" % - printer.name) - i = i + 1 - - obj.pretty_printers.insert(0, printer) - - -class RegexpCollectionPrettyPrinter(PrettyPrinter): - """Class for implementing a collection of regular-expression based pretty-printers. - - Intended usage: - - pretty_printer = RegexpCollectionPrettyPrinter("my_library") - pretty_printer.add_printer("myclass1", "^myclass1$", MyClass1Printer) - ...
- pretty_printer.add_printer("myclassN", "^myclassN$", MyClassNPrinter) - register_pretty_printer(obj, pretty_printer) - """ - - class RegexpSubprinter(SubPrettyPrinter): - def __init__(self, name, regexp, gen_printer): - super(RegexpCollectionPrettyPrinter.RegexpSubprinter, self).__init__(name) - self.regexp = regexp - self.gen_printer = gen_printer - self.compiled_re = re.compile(regexp) - - def __init__(self, name): - super(RegexpCollectionPrettyPrinter, self).__init__(name, []) - - def add_printer(self, name, regexp, gen_printer): - """Add a printer to the list. - - The printer is added to the end of the list. - - Arguments: - name: The name of the subprinter. - regexp: The regular expression, as a string. - gen_printer: A function/method that given a value returns an - object to pretty-print it. - - Returns: - Nothing. - """ - - # NOTE: A previous version made the name of each printer the regexp. - # That makes it awkward to pass to the enable/disable commands (it's - # cumbersome to make a regexp of a regexp). So now the name is a - # separate parameter. - - self.subprinters.append(self.RegexpSubprinter(name, regexp, - gen_printer)) - - def __call__(self, val): - """Lookup the pretty-printer for the provided value.""" - - # Get the type name. - typename = gdb.types.get_basic_type(val.type).tag - if not typename: - return None - - # Iterate over table of type regexps to determine - # if a printer is registered for that type. - # Return an instantiation of the printer if found. - for printer in self.subprinters: - if printer.enabled and printer.compiled_re.search(typename): - return printer.gen_printer(val) - - # Cannot find a pretty printer. Return None. - return None - -# A helper class for printing enum types. This class is instantiated -# with a list of enumerators to print a particular Value. 
-class _EnumInstance: - def __init__(self, enumerators, val): - self.enumerators = enumerators - self.val = val - - def to_string(self): - flag_list = [] - v = long(self.val) - any_found = False - for (e_name, e_value) in self.enumerators: - if v & e_value != 0: - flag_list.append(e_name) - v = v & ~e_value - any_found = True - if not any_found or v != 0: - # Leftover value. - flag_list.append('<unknown: 0x%x>' % v) - return "0x%x [%s]" % (self.val, " | ".join(flag_list)) - -class FlagEnumerationPrinter(PrettyPrinter): - """A pretty-printer which can be used to print a flag-style enumeration. - A flag-style enumeration is one where the enumerators are or'd - together to create values. The new printer will print these - symbolically using '|' notation. The printer must be registered - manually. This printer is most useful when an enum is flag-like, - but has some overlap. GDB's built-in printing will not handle - this case, but this printer will attempt to.""" - - def __init__(self, enum_type): - super(FlagEnumerationPrinter, self).__init__(enum_type) - self.initialized = False - - def __call__(self, val): - if not self.initialized: - self.initialized = True - flags = gdb.lookup_type(self.name) - self.enumerators = [] - for field in flags.fields(): - self.enumerators.append((field.name, field.enumval)) - # Sorting the enumerators by value usually does the right - # thing. The enumerators are (name, value) tuples. - self.enumerators.sort(key = lambda x: x[1]) - - if self.enabled: - return _EnumInstance(self.enumerators, val) - else: - return None diff --git a/share/gdb/python/gdb/prompt.py b/share/gdb/python/gdb/prompt.py deleted file mode 100644 index bb1975b..0000000 --- a/share/gdb/python/gdb/prompt.py +++ /dev/null @@ -1,148 +0,0 @@ -# Extended prompt utilities. -# Copyright (C) 2011-2013 Free Software Foundation, Inc.
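
The `_EnumInstance.to_string` method above decodes a bitmask into OR'd enumerator names, collecting any leftover bits as `<unknown>`. The same decoding works standalone; the `FLAGS` table below is a hypothetical flag enum standing in for the `(field.name, field.enumval)` pairs the real printer builds:

```python
# Standalone sketch of _EnumInstance.to_string: decode a bitmask into
# OR'd enumerator names, with leftover bits reported as <unknown>.
def flags_to_string(enumerators, val):
    flag_list = []
    v = val
    any_found = False
    for e_name, e_value in enumerators:
        if v & e_value != 0:
            flag_list.append(e_name)
            v &= ~e_value          # clear the bits this flag accounted for
            any_found = True
    if not any_found or v != 0:
        # Bits no enumerator matched, or no match at all.
        flag_list.append('<unknown: 0x%x>' % v)
    return "0x%x [%s]" % (val, " | ".join(flag_list))

FLAGS = [("READ", 1), ("WRITE", 2), ("EXEC", 4)]  # hypothetical enum
print(flags_to_string(FLAGS, 3))
```

Overlapping enumerators are handled naturally: each one that intersects the remaining bits is listed, then its bits are cleared.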
- -# This program is free software; you can redistribute it and/or modify -# it under the terms of the GNU General Public License as published by -# the Free Software Foundation; either version 3 of the License, or -# (at your option) any later version. -# -# This program is distributed in the hope that it will be useful, -# but WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -# GNU General Public License for more details. -# -# You should have received a copy of the GNU General Public License -# along with this program. If not, see <http://www.gnu.org/licenses/>. - -""" Extended prompt library functions.""" - -import gdb -import os - -def _prompt_pwd(ignore): - "The current working directory." - return os.getcwd() - -def _prompt_object_attr(func, what, attr, nattr): - """Internal worker for fetching GDB attributes.""" - if attr is None: - attr = nattr - try: - obj = func() - except gdb.error: - return '<no %s>' % what - if hasattr(obj, attr): - result = getattr(obj, attr) - if callable(result): - result = result() - return result - else: - return '<no attribute %s on current %s>' % (attr, what) - -def _prompt_frame(attr): - "The selected frame; an argument names a frame parameter." - return _prompt_object_attr(gdb.selected_frame, 'frame', attr, 'name') - -def _prompt_thread(attr): - "The selected thread; an argument names a thread parameter." - return _prompt_object_attr(gdb.selected_thread, 'thread', attr, 'num') - -def _prompt_version(attr): - "The version of GDB." - return gdb.VERSION - -def _prompt_esc(attr): - "The ESC character." - return '\033' - -def _prompt_bs(attr): - "A backslash." - return '\\' - -def _prompt_n(attr): - "A newline." - return '\n' - -def _prompt_r(attr): - "A carriage return." - return '\r' - -def _prompt_param(attr): - "A parameter's value; the argument names the parameter."
- return gdb.parameter(attr) - -def _prompt_noprint_begin(attr): - "Begins a sequence of non-printing characters." - return '\001' - -def _prompt_noprint_end(attr): - "Ends a sequence of non-printing characters." - return '\002' - -prompt_substitutions = { - 'e': _prompt_esc, - '\\': _prompt_bs, - 'n': _prompt_n, - 'r': _prompt_r, - 'v': _prompt_version, - 'w': _prompt_pwd, - 'f': _prompt_frame, - 't': _prompt_thread, - 'p': _prompt_param, - '[': _prompt_noprint_begin, - ']': _prompt_noprint_end -} - -def prompt_help(): - """Generate help dynamically from the __doc__ strings of attribute - functions.""" - - result = '' - keys = sorted (prompt_substitutions.keys()) - for key in keys: - result += ' \\%s\t%s\n' % (key, prompt_substitutions[key].__doc__) - result += """ -A substitution can be used in a simple form, like "\\f". -An argument can also be passed to it, like "\\f{name}". -The meaning of the argument depends on the particular substitution.""" - return result - -def substitute_prompt(prompt): - "Perform substitutions on PROMPT." - - result = '' - plen = len(prompt) - i = 0 - while i < plen: - if prompt[i] == '\\': - i = i + 1 - if i >= plen: - break - cmdch = prompt[i] - - if cmdch in prompt_substitutions: - cmd = prompt_substitutions[cmdch] - - if i + 1 < plen and prompt[i + 1] == '{': - j = i + 1 - while j < plen and prompt[j] != '}': - j = j + 1 - # Just ignore formatting errors. - if j >= plen or prompt[j] != '}': - arg = None - else: - arg = prompt[i + 2 : j] - i = j - else: - arg = None - result += str(cmd(arg)) - else: - # Unrecognized escapes are turned into the escaped - # character itself. - result += prompt[i] - else: - result += prompt[i] - - i = i + 1 - - return result diff --git a/share/gdb/python/gdb/types.py b/share/gdb/python/gdb/types.py deleted file mode 100644 index ffc817c..0000000 --- a/share/gdb/python/gdb/types.py +++ /dev/null @@ -1,176 +0,0 @@ -# Type utilities. -# Copyright (C) 2010-2013 Free Software Foundation, Inc. 
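The `substitute_prompt` parser in the deleted `prompt.py` walks the string looking for backslash escapes, each optionally followed by a `{...}` argument, and expands unrecognized escapes to the escaped character itself. A minimal standalone sketch of the same loop, using a toy substitution table in place of the `gdb`-backed `prompt_substitutions` (the `n` and `u` entries are invented for illustration):

```python
def substitute(prompt, table):
    """Expand backslash escapes in PROMPT using TABLE.

    TABLE maps an escape character to a one-argument function; the
    argument is the text of an optional trailing {...}, or None.
    """
    result = ''
    i = 0
    while i < len(prompt):
        if prompt[i] == '\\':
            i += 1
            if i >= len(prompt):
                break
            ch = prompt[i]
            if ch in table:
                arg = None
                if i + 1 < len(prompt) and prompt[i + 1] == '{':
                    j = prompt.find('}', i + 2)
                    if j == -1:
                        i = len(prompt)  # malformed argument: swallow the rest
                    else:
                        arg = prompt[i + 2:j]
                        i = j
                result += str(table[ch](arg))
            else:
                # Unrecognized escapes become the escaped character itself.
                result += ch
        else:
            result += prompt[i]
        i += 1
    return result

toy = {'n': lambda arg: '\n',
       'u': lambda arg: 'user' if arg is None else 'user:%s' % arg}
```

With this table, `substitute('hi\\u{name}!', toy)` produces `'hiuser:name!'`, matching the original's handling of `\f{name}`-style arguments.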
- -# This program is free software; you can redistribute it and/or modify -# it under the terms of the GNU General Public License as published by -# the Free Software Foundation; either version 3 of the License, or -# (at your option) any later version. -# -# This program is distributed in the hope that it will be useful, -# but WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -# GNU General Public License for more details. -# -# You should have received a copy of the GNU General Public License -# along with this program. If not, see <http://www.gnu.org/licenses/>. - -"""Utilities for working with gdb.Types.""" - -import gdb - - -def get_basic_type(type_): - """Return the "basic" type of a type. - - Arguments: - type_: The type to reduce to its basic type. - - Returns: - type_ with const/volatile is stripped away, - and typedefs/references converted to the underlying type. - """ - - while (type_.code == gdb.TYPE_CODE_REF or - type_.code == gdb.TYPE_CODE_TYPEDEF): - if type_.code == gdb.TYPE_CODE_REF: - type_ = type_.target() - else: - type_ = type_.strip_typedefs() - return type_.unqualified() - - -def has_field(type_, field): - """Return True if a type has the specified field. - - Arguments: - type_: The type to examine. - It must be one of gdb.TYPE_CODE_STRUCT, gdb.TYPE_CODE_UNION. - field: The name of the field to look up. - - Returns: - True if the field is present either in type_ or any baseclass. - - Raises: - TypeError: The type is not a struct or union. - """ - - type_ = get_basic_type(type_) - if (type_.code != gdb.TYPE_CODE_STRUCT and - type_.code != gdb.TYPE_CODE_UNION): - raise TypeError("not a struct or union") - for f in type_.fields(): - if f.is_base_class: - if has_field(f.type, field): - return True - else: - # NOTE: f.name could be None - if f.name == field: - return True - return False - - -def make_enum_dict(enum_type): - """Return a dictionary from a program's enum type. 
- - Arguments: - enum_type: The enum to compute the dictionary for. - - Returns: - The dictionary of the enum. - - Raises: - TypeError: The type is not an enum. - """ - - if enum_type.code != gdb.TYPE_CODE_ENUM: - raise TypeError("not an enum type") - enum_dict = {} - for field in enum_type.fields(): - # The enum's value is stored in "enumval". - enum_dict[field.name] = field.enumval - return enum_dict - - -def deep_items (type_): - """Return an iterator that recursively traverses anonymous fields. - - Arguments: - type_: The type to traverse. It should be one of - gdb.TYPE_CODE_STRUCT or gdb.TYPE_CODE_UNION. - - Returns: - an iterator similar to gdb.Type.iteritems(), i.e., it returns - pairs of key, value, but for any anonymous struct or union - field that field is traversed recursively, depth-first. - """ - for k, v in type_.iteritems (): - if k: - yield k, v - else: - for i in deep_items (v.type): - yield i - -class TypePrinter(object): - """The base class for type printers. - - Instances of this type can be used to substitute type names during - 'ptype'. - - A type printer must have at least 'name' and 'enabled' attributes, - and supply an 'instantiate' method. - - The 'instantiate' method must either return None, or return an - object which has a 'recognize' method. This method must accept a - gdb.Type argument and either return None, meaning that the type - was not recognized, or a string naming the type. - """ - - def __init__(self, name): - self.name = name - self.enabled = True - - def instantiate(self): - return None - -# Helper function for computing the list of type recognizers. -def _get_some_type_recognizers(result, plist): - for printer in plist: - if printer.enabled: - inst = printer.instantiate() - if inst is not None: - result.append(inst) - return None - -def get_type_recognizers(): - "Return a list of the enabled type recognizers for the current context." - result = [] - - # First try the objfiles. 
- for objfile in gdb.objfiles(): - _get_some_type_recognizers(result, objfile.type_printers) - # Now try the program space. - _get_some_type_recognizers(result, gdb.current_progspace().type_printers) - # Finally, globals. - _get_some_type_recognizers(result, gdb.type_printers) - - return result - -def apply_type_recognizers(recognizers, type_obj): - """Apply the given list of type recognizers to the type TYPE_OBJ. - If any recognizer in the list recognizes TYPE_OBJ, returns the name - given by the recognizer. Otherwise, this returns None.""" - for r in recognizers: - result = r.recognize(type_obj) - if result is not None: - return result - return None - -def register_type_printer(locus, printer): - """Register a type printer. - PRINTER is the type printer instance. - LOCUS is either an objfile, a program space, or None, indicating - global registration.""" - - if locus is None: - locus = gdb - locus.type_printers.insert(0, printer) diff --git a/share/gdb/syscalls/amd64-linux.xml b/share/gdb/syscalls/amd64-linux.xml deleted file mode 100644 index bf3da5d..0000000 --- a/share/gdb/syscalls/amd64-linux.xml +++ /dev/null @@ -1,314 +0,0 @@ -<?xml version="1.0"?> -<!-- Copyright (C) 2009-2013 Free Software Foundation, Inc. - - Copying and distribution of this file, with or without modification, - are permitted in any medium without royalty provided the copyright - notice and this notice are preserved. --> - -<!DOCTYPE feature SYSTEM "gdb-syscalls.dtd"> - -<!-- This file was generated using the following file: - - /usr/src/linux/arch/x86/include/asm/unistd_64.h - - The file mentioned above belongs to the Linux Kernel. 
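The type-recognizer protocol that `apply_type_recognizers` in the deleted `types.py` relies on only requires objects with a `recognize` method returning a name or `None`. A gdb-free sketch of that dispatch, where a plain string stands in for the `gdb.Type` the real API passes, and the `VecRecognizer` class with its `__vec_impl` naming is entirely made up for illustration:

```python
class VecRecognizer(object):
    """Toy recognizer: rewrites an internal vector name to a friendly one."""
    def recognize(self, type_name):
        prefix = '__vec_impl<'
        if type_name.startswith(prefix):
            return 'vec<' + type_name[len(prefix):]
        return None  # not our type

def apply_type_recognizers(recognizers, type_name):
    # The first recognizer that claims the type wins, as in gdb.types.
    for r in recognizers:
        result = r.recognize(type_name)
        if result is not None:
            return result
    return None
```

So `apply_type_recognizers([VecRecognizer()], '__vec_impl<int>')` returns `'vec<int>'`, while an unclaimed name such as `'int'` falls through to `None`.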
--> - -<syscalls_info> - <syscall name="read" number="0"/> - <syscall name="write" number="1"/> - <syscall name="open" number="2"/> - <syscall name="close" number="3"/> - <syscall name="stat" number="4"/> - <syscall name="fstat" number="5"/> - <syscall name="lstat" number="6"/> - <syscall name="poll" number="7"/> - <syscall name="lseek" number="8"/> - <syscall name="mmap" number="9"/> - <syscall name="mprotect" number="10"/> - <syscall name="munmap" number="11"/> - <syscall name="brk" number="12"/> - <syscall name="rt_sigaction" number="13"/> - <syscall name="rt_sigprocmask" number="14"/> - <syscall name="rt_sigreturn" number="15"/> - <syscall name="ioctl" number="16"/> - <syscall name="pread64" number="17"/> - <syscall name="pwrite64" number="18"/> - <syscall name="readv" number="19"/> - <syscall name="writev" number="20"/> - <syscall name="access" number="21"/> - <syscall name="pipe" number="22"/> - <syscall name="select" number="23"/> - <syscall name="sched_yield" number="24"/> - <syscall name="mremap" number="25"/> - <syscall name="msync" number="26"/> - <syscall name="mincore" number="27"/> - <syscall name="madvise" number="28"/> - <syscall name="shmget" number="29"/> - <syscall name="shmat" number="30"/> - <syscall name="shmctl" number="31"/> - <syscall name="dup" number="32"/> - <syscall name="dup2" number="33"/> - <syscall name="pause" number="34"/> - <syscall name="nanosleep" number="35"/> - <syscall name="getitimer" number="36"/> - <syscall name="alarm" number="37"/> - <syscall name="setitimer" number="38"/> - <syscall name="getpid" number="39"/> - <syscall name="sendfile" number="40"/> - <syscall name="socket" number="41"/> - <syscall name="connect" number="42"/> - <syscall name="accept" number="43"/> - <syscall name="sendto" number="44"/> - <syscall name="recvfrom" number="45"/> - <syscall name="sendmsg" number="46"/> - <syscall name="recvmsg" number="47"/> - <syscall name="shutdown" number="48"/> - <syscall name="bind" number="49"/> - <syscall 
name="listen" number="50"/> - <syscall name="getsockname" number="51"/> - <syscall name="getpeername" number="52"/> - <syscall name="socketpair" number="53"/> - <syscall name="setsockopt" number="54"/> - <syscall name="getsockopt" number="55"/> - <syscall name="clone" number="56"/> - <syscall name="fork" number="57"/> - <syscall name="vfork" number="58"/> - <syscall name="execve" number="59"/> - <syscall name="exit" number="60"/> - <syscall name="wait4" number="61"/> - <syscall name="kill" number="62"/> - <syscall name="uname" number="63"/> - <syscall name="semget" number="64"/> - <syscall name="semop" number="65"/> - <syscall name="semctl" number="66"/> - <syscall name="shmdt" number="67"/> - <syscall name="msgget" number="68"/> - <syscall name="msgsnd" number="69"/> - <syscall name="msgrcv" number="70"/> - <syscall name="msgctl" number="71"/> - <syscall name="fcntl" number="72"/> - <syscall name="flock" number="73"/> - <syscall name="fsync" number="74"/> - <syscall name="fdatasync" number="75"/> - <syscall name="truncate" number="76"/> - <syscall name="ftruncate" number="77"/> - <syscall name="getdents" number="78"/> - <syscall name="getcwd" number="79"/> - <syscall name="chdir" number="80"/> - <syscall name="fchdir" number="81"/> - <syscall name="rename" number="82"/> - <syscall name="mkdir" number="83"/> - <syscall name="rmdir" number="84"/> - <syscall name="creat" number="85"/> - <syscall name="link" number="86"/> - <syscall name="unlink" number="87"/> - <syscall name="symlink" number="88"/> - <syscall name="readlink" number="89"/> - <syscall name="chmod" number="90"/> - <syscall name="fchmod" number="91"/> - <syscall name="chown" number="92"/> - <syscall name="fchown" number="93"/> - <syscall name="lchown" number="94"/> - <syscall name="umask" number="95"/> - <syscall name="gettimeofday" number="96"/> - <syscall name="getrlimit" number="97"/> - <syscall name="getrusage" number="98"/> - <syscall name="sysinfo" number="99"/> - <syscall name="times" 
number="100"/> - <syscall name="ptrace" number="101"/> - <syscall name="getuid" number="102"/> - <syscall name="syslog" number="103"/> - <syscall name="getgid" number="104"/> - <syscall name="setuid" number="105"/> - <syscall name="setgid" number="106"/> - <syscall name="geteuid" number="107"/> - <syscall name="getegid" number="108"/> - <syscall name="setpgid" number="109"/> - <syscall name="getppid" number="110"/> - <syscall name="getpgrp" number="111"/> - <syscall name="setsid" number="112"/> - <syscall name="setreuid" number="113"/> - <syscall name="setregid" number="114"/> - <syscall name="getgroups" number="115"/> - <syscall name="setgroups" number="116"/> - <syscall name="setresuid" number="117"/> - <syscall name="getresuid" number="118"/> - <syscall name="setresgid" number="119"/> - <syscall name="getresgid" number="120"/> - <syscall name="getpgid" number="121"/> - <syscall name="setfsuid" number="122"/> - <syscall name="setfsgid" number="123"/> - <syscall name="getsid" number="124"/> - <syscall name="capget" number="125"/> - <syscall name="capset" number="126"/> - <syscall name="rt_sigpending" number="127"/> - <syscall name="rt_sigtimedwait" number="128"/> - <syscall name="rt_sigqueueinfo" number="129"/> - <syscall name="rt_sigsuspend" number="130"/> - <syscall name="sigaltstack" number="131"/> - <syscall name="utime" number="132"/> - <syscall name="mknod" number="133"/> - <syscall name="uselib" number="134"/> - <syscall name="personality" number="135"/> - <syscall name="ustat" number="136"/> - <syscall name="statfs" number="137"/> - <syscall name="fstatfs" number="138"/> - <syscall name="sysfs" number="139"/> - <syscall name="getpriority" number="140"/> - <syscall name="setpriority" number="141"/> - <syscall name="sched_setparam" number="142"/> - <syscall name="sched_getparam" number="143"/> - <syscall name="sched_setscheduler" number="144"/> - <syscall name="sched_getscheduler" number="145"/> - <syscall name="sched_get_priority_max" number="146"/> - 
<syscall name="sched_get_priority_min" number="147"/> - <syscall name="sched_rr_get_interval" number="148"/> - <syscall name="mlock" number="149"/> - <syscall name="munlock" number="150"/> - <syscall name="mlockall" number="151"/> - <syscall name="munlockall" number="152"/> - <syscall name="vhangup" number="153"/> - <syscall name="modify_ldt" number="154"/> - <syscall name="pivot_root" number="155"/> - <syscall name="_sysctl" number="156"/> - <syscall name="prctl" number="157"/> - <syscall name="arch_prctl" number="158"/> - <syscall name="adjtimex" number="159"/> - <syscall name="setrlimit" number="160"/> - <syscall name="chroot" number="161"/> - <syscall name="sync" number="162"/> - <syscall name="acct" number="163"/> - <syscall name="settimeofday" number="164"/> - <syscall name="mount" number="165"/> - <syscall name="umount2" number="166"/> - <syscall name="swapon" number="167"/> - <syscall name="swapoff" number="168"/> - <syscall name="reboot" number="169"/> - <syscall name="sethostname" number="170"/> - <syscall name="setdomainname" number="171"/> - <syscall name="iopl" number="172"/> - <syscall name="ioperm" number="173"/> - <syscall name="create_module" number="174"/> - <syscall name="init_module" number="175"/> - <syscall name="delete_module" number="176"/> - <syscall name="get_kernel_syms" number="177"/> - <syscall name="query_module" number="178"/> - <syscall name="quotactl" number="179"/> - <syscall name="nfsservctl" number="180"/> - <syscall name="getpmsg" number="181"/> - <syscall name="putpmsg" number="182"/> - <syscall name="afs_syscall" number="183"/> - <syscall name="tuxcall" number="184"/> - <syscall name="security" number="185"/> - <syscall name="gettid" number="186"/> - <syscall name="readahead" number="187"/> - <syscall name="setxattr" number="188"/> - <syscall name="lsetxattr" number="189"/> - <syscall name="fsetxattr" number="190"/> - <syscall name="getxattr" number="191"/> - <syscall name="lgetxattr" number="192"/> - <syscall name="fgetxattr" 
number="193"/> - <syscall name="listxattr" number="194"/> - <syscall name="llistxattr" number="195"/> - <syscall name="flistxattr" number="196"/> - <syscall name="removexattr" number="197"/> - <syscall name="lremovexattr" number="198"/> - <syscall name="fremovexattr" number="199"/> - <syscall name="tkill" number="200"/> - <syscall name="time" number="201"/> - <syscall name="futex" number="202"/> - <syscall name="sched_setaffinity" number="203"/> - <syscall name="sched_getaffinity" number="204"/> - <syscall name="set_thread_area" number="205"/> - <syscall name="io_setup" number="206"/> - <syscall name="io_destroy" number="207"/> - <syscall name="io_getevents" number="208"/> - <syscall name="io_submit" number="209"/> - <syscall name="io_cancel" number="210"/> - <syscall name="get_thread_area" number="211"/> - <syscall name="lookup_dcookie" number="212"/> - <syscall name="epoll_create" number="213"/> - <syscall name="epoll_ctl_old" number="214"/> - <syscall name="epoll_wait_old" number="215"/> - <syscall name="remap_file_pages" number="216"/> - <syscall name="getdents64" number="217"/> - <syscall name="set_tid_address" number="218"/> - <syscall name="restart_syscall" number="219"/> - <syscall name="semtimedop" number="220"/> - <syscall name="fadvise64" number="221"/> - <syscall name="timer_create" number="222"/> - <syscall name="timer_settime" number="223"/> - <syscall name="timer_gettime" number="224"/> - <syscall name="timer_getoverrun" number="225"/> - <syscall name="timer_delete" number="226"/> - <syscall name="clock_settime" number="227"/> - <syscall name="clock_gettime" number="228"/> - <syscall name="clock_getres" number="229"/> - <syscall name="clock_nanosleep" number="230"/> - <syscall name="exit_group" number="231"/> - <syscall name="epoll_wait" number="232"/> - <syscall name="epoll_ctl" number="233"/> - <syscall name="tgkill" number="234"/> - <syscall name="utimes" number="235"/> - <syscall name="vserver" number="236"/> - <syscall name="mbind" 
number="237"/> - <syscall name="set_mempolicy" number="238"/> - <syscall name="get_mempolicy" number="239"/> - <syscall name="mq_open" number="240"/> - <syscall name="mq_unlink" number="241"/> - <syscall name="mq_timedsend" number="242"/> - <syscall name="mq_timedreceive" number="243"/> - <syscall name="mq_notify" number="244"/> - <syscall name="mq_getsetattr" number="245"/> - <syscall name="kexec_load" number="246"/> - <syscall name="waitid" number="247"/> - <syscall name="add_key" number="248"/> - <syscall name="request_key" number="249"/> - <syscall name="keyctl" number="250"/> - <syscall name="ioprio_set" number="251"/> - <syscall name="ioprio_get" number="252"/> - <syscall name="inotify_init" number="253"/> - <syscall name="inotify_add_watch" number="254"/> - <syscall name="inotify_rm_watch" number="255"/> - <syscall name="migrate_pages" number="256"/> - <syscall name="openat" number="257"/> - <syscall name="mkdirat" number="258"/> - <syscall name="mknodat" number="259"/> - <syscall name="fchownat" number="260"/> - <syscall name="futimesat" number="261"/> - <syscall name="newfstatat" number="262"/> - <syscall name="unlinkat" number="263"/> - <syscall name="renameat" number="264"/> - <syscall name="linkat" number="265"/> - <syscall name="symlinkat" number="266"/> - <syscall name="readlinkat" number="267"/> - <syscall name="fchmodat" number="268"/> - <syscall name="faccessat" number="269"/> - <syscall name="pselect6" number="270"/> - <syscall name="ppoll" number="271"/> - <syscall name="unshare" number="272"/> - <syscall name="set_robust_list" number="273"/> - <syscall name="get_robust_list" number="274"/> - <syscall name="splice" number="275"/> - <syscall name="tee" number="276"/> - <syscall name="sync_file_range" number="277"/> - <syscall name="vmsplice" number="278"/> - <syscall name="move_pages" number="279"/> - <syscall name="utimensat" number="280"/> - <syscall name="epoll_pwait" number="281"/> - <syscall name="signalfd" number="282"/> - <syscall 
name="timerfd_create" number="283"/> - <syscall name="eventfd" number="284"/> - <syscall name="fallocate" number="285"/> - <syscall name="timerfd_settime" number="286"/> - <syscall name="timerfd_gettime" number="287"/> - <syscall name="accept4" number="288"/> - <syscall name="signalfd4" number="289"/> - <syscall name="eventfd2" number="290"/> - <syscall name="epoll_create1" number="291"/> - <syscall name="dup3" number="292"/> - <syscall name="pipe2" number="293"/> - <syscall name="inotify_init1" number="294"/> - <syscall name="preadv" number="295"/> - <syscall name="pwritev" number="296"/> -</syscalls_info> diff --git a/share/gdb/syscalls/gdb-syscalls.dtd b/share/gdb/syscalls/gdb-syscalls.dtd deleted file mode 100644 index 05c1ccf..0000000 --- a/share/gdb/syscalls/gdb-syscalls.dtd +++ /dev/null @@ -1,14 +0,0 @@ -<!-- Copyright (C) 2009-2013 Free Software Foundation, Inc. - - Copying and distribution of this file, with or without modification, - are permitted in any medium without royalty provided the copyright - notice and this notice are preserved. --> - -<!-- The root element of a syscall info is <syscalls-info>. --> - -<!ELEMENT syscalls-info (syscall*)> - -<!ELEMENT syscall EMPTY> -<!ATTLIST syscall - name CDATA #REQUIRED - number CDATA #REQUIRED> diff --git a/share/gdb/syscalls/i386-linux.xml b/share/gdb/syscalls/i386-linux.xml deleted file mode 100644 index 80512d8..0000000 --- a/share/gdb/syscalls/i386-linux.xml +++ /dev/null @@ -1,340 +0,0 @@ -<?xml version="1.0"?> -<!-- Copyright (C) 2009-2013 Free Software Foundation, Inc. - - Copying and distribution of this file, with or without modification, - are permitted in any medium without royalty provided the copyright - notice and this notice are preserved. --> - -<!DOCTYPE feature SYSTEM "gdb-syscalls.dtd"> - -<!-- This file was generated using the following file: - - /usr/src/linux/arch/x86/include/asm/unistd_32.h - - The file mentioned above belongs to the Linux Kernel. 
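Per the `gdb-syscalls.dtd` above, each deleted syscall table is just a flat list of `<syscall name=... number=.../>` elements under a single root, so a file in this format can be loaded into a number-to-name map with the standard library alone. The snippet inlines a two-entry sample rather than reading one of the real files:

```python
import xml.etree.ElementTree as ET

SAMPLE = """<syscalls_info>
  <syscall name="read" number="0"/>
  <syscall name="write" number="1"/>
</syscalls_info>"""

def load_syscall_table(xml_text):
    """Return a dict mapping syscall number (int) to syscall name."""
    root = ET.fromstring(xml_text)
    return {int(e.get('number')): e.get('name')
            for e in root.iter('syscall')}
```

For instance, `load_syscall_table(SAMPLE)[1]` is `'write'`; pointing the same loader at the amd64 or i386 table would let a tool translate a trapped syscall number into its name.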
--> - -<syscalls_info> - <syscall name="restart_syscall" number="0"/> - <syscall name="exit" number="1"/> - <syscall name="fork" number="2"/> - <syscall name="read" number="3"/> - <syscall name="write" number="4"/> - <syscall name="open" number="5"/> - <syscall name="close" number="6"/> - <syscall name="waitpid" number="7"/> - <syscall name="creat" number="8"/> - <syscall name="link" number="9"/> - <syscall name="unlink" number="10"/> - <syscall name="execve" number="11"/> - <syscall name="chdir" number="12"/> - <syscall name="time" number="13"/> - <syscall name="mknod" number="14"/> - <syscall name="chmod" number="15"/> - <syscall name="lchown" number="16"/> - <syscall name="break" number="17"/> - <syscall name="oldstat" number="18"/> - <syscall name="lseek" number="19"/> - <syscall name="getpid" number="20"/> - <syscall name="mount" number="21"/> - <syscall name="umount" number="22"/> - <syscall name="setuid" number="23"/> - <syscall name="getuid" number="24"/> - <syscall name="stime" number="25"/> - <syscall name="ptrace" number="26"/> - <syscall name="alarm" number="27"/> - <syscall name="oldfstat" number="28"/> - <syscall name="pause" number="29"/> - <syscall name="utime" number="30"/> - <syscall name="stty" number="31"/> - <syscall name="gtty" number="32"/> - <syscall name="access" number="33"/> - <syscall name="nice" number="34"/> - <syscall name="ftime" number="35"/> - <syscall name="sync" number="36"/> - <syscall name="kill" number="37"/> - <syscall name="rename" number="38"/> - <syscall name="mkdir" number="39"/> - <syscall name="rmdir" number="40"/> - <syscall name="dup" number="41"/> - <syscall name="pipe" number="42"/> - <syscall name="times" number="43"/> - <syscall name="prof" number="44"/> - <syscall name="brk" number="45"/> - <syscall name="setgid" number="46"/> - <syscall name="getgid" number="47"/> - <syscall name="signal" number="48"/> - <syscall name="geteuid" number="49"/> - <syscall name="getegid" number="50"/> - <syscall name="acct" 
number="51"/> - <syscall name="umount2" number="52"/> - <syscall name="lock" number="53"/> - <syscall name="ioctl" number="54"/> - <syscall name="fcntl" number="55"/> - <syscall name="mpx" number="56"/> - <syscall name="setpgid" number="57"/> - <syscall name="ulimit" number="58"/> - <syscall name="oldolduname" number="59"/> - <syscall name="umask" number="60"/> - <syscall name="chroot" number="61"/> - <syscall name="ustat" number="62"/> - <syscall name="dup2" number="63"/> - <syscall name="getppid" number="64"/> - <syscall name="getpgrp" number="65"/> - <syscall name="setsid" number="66"/> - <syscall name="sigaction" number="67"/> - <syscall name="sgetmask" number="68"/> - <syscall name="ssetmask" number="69"/> - <syscall name="setreuid" number="70"/> - <syscall name="setregid" number="71"/> - <syscall name="sigsuspend" number="72"/> - <syscall name="sigpending" number="73"/> - <syscall name="sethostname" number="74"/> - <syscall name="setrlimit" number="75"/> - <syscall name="getrlimit" number="76"/> - <syscall name="getrusage" number="77"/> - <syscall name="gettimeofday" number="78"/> - <syscall name="settimeofday" number="79"/> - <syscall name="getgroups" number="80"/> - <syscall name="setgroups" number="81"/> - <syscall name="select" number="82"/> - <syscall name="symlink" number="83"/> - <syscall name="oldlstat" number="84"/> - <syscall name="readlink" number="85"/> - <syscall name="uselib" number="86"/> - <syscall name="swapon" number="87"/> - <syscall name="reboot" number="88"/> - <syscall name="readdir" number="89"/> - <syscall name="mmap" number="90"/> - <syscall name="munmap" number="91"/> - <syscall name="truncate" number="92"/> - <syscall name="ftruncate" number="93"/> - <syscall name="fchmod" number="94"/> - <syscall name="fchown" number="95"/> - <syscall name="getpriority" number="96"/> - <syscall name="setpriority" number="97"/> - <syscall name="profil" number="98"/> - <syscall name="statfs" number="99"/> - <syscall name="fstatfs" number="100"/> - 
<syscall name="ioperm" number="101"/> - <syscall name="socketcall" number="102"/> - <syscall name="syslog" number="103"/> - <syscall name="setitimer" number="104"/> - <syscall name="getitimer" number="105"/> - <syscall name="stat" number="106"/> - <syscall name="lstat" number="107"/> - <syscall name="fstat" number="108"/> - <syscall name="olduname" number="109"/> - <syscall name="iopl" number="110"/> - <syscall name="vhangup" number="111"/> - <syscall name="idle" number="112"/> - <syscall name="vm86old" number="113"/> - <syscall name="wait4" number="114"/> - <syscall name="swapoff" number="115"/> - <syscall name="sysinfo" number="116"/> - <syscall name="ipc" number="117"/> - <syscall name="fsync" number="118"/> - <syscall name="sigreturn" number="119"/> - <syscall name="clone" number="120"/> - <syscall name="setdomainname" number="121"/> - <syscall name="uname" number="122"/> - <syscall name="modify_ldt" number="123"/> - <syscall name="adjtimex" number="124"/> - <syscall name="mprotect" number="125"/> - <syscall name="sigprocmask" number="126"/> - <syscall name="create_module" number="127"/> - <syscall name="init_module" number="128"/> - <syscall name="delete_module" number="129"/> - <syscall name="get_kernel_syms" number="130"/> - <syscall name="quotactl" number="131"/> - <syscall name="getpgid" number="132"/> - <syscall name="fchdir" number="133"/> - <syscall name="bdflush" number="134"/> - <syscall name="sysfs" number="135"/> - <syscall name="personality" number="136"/> - <syscall name="afs_syscall" number="137"/> - <syscall name="setfsuid" number="138"/> - <syscall name="setfsgid" number="139"/> - <syscall name="_llseek" number="140"/> - <syscall name="getdents" number="141"/> - <syscall name="_newselect" number="142"/> - <syscall name="flock" number="143"/> - <syscall name="msync" number="144"/> - <syscall name="readv" number="145"/> - <syscall name="writev" number="146"/> - <syscall name="getsid" number="147"/> - <syscall name="fdatasync" number="148"/> - 
<syscall name="_sysctl" number="149"/> - <syscall name="mlock" number="150"/> - <syscall name="munlock" number="151"/> - <syscall name="mlockall" number="152"/> - <syscall name="munlockall" number="153"/> - <syscall name="sched_setparam" number="154"/> - <syscall name="sched_getparam" number="155"/> - <syscall name="sched_setscheduler" number="156"/> - <syscall name="sched_getscheduler" number="157"/> - <syscall name="sched_yield" number="158"/> - <syscall name="sched_get_priority_max" number="159"/> - <syscall name="sched_get_priority_min" number="160"/> - <syscall name="sched_rr_get_interval" number="161"/> - <syscall name="nanosleep" number="162"/> - <syscall name="mremap" number="163"/> - <syscall name="setresuid" number="164"/> - <syscall name="getresuid" number="165"/> - <syscall name="vm86" number="166"/> - <syscall name="query_module" number="167"/> - <syscall name="poll" number="168"/> - <syscall name="nfsservctl" number="169"/> - <syscall name="setresgid" number="170"/> - <syscall name="getresgid" number="171"/> - <syscall name="prctl" number="172"/> - <syscall name="rt_sigreturn" number="173"/> - <syscall name="rt_sigaction" number="174"/> - <syscall name="rt_sigprocmask" number="175"/> - <syscall name="rt_sigpending" number="176"/> - <syscall name="rt_sigtimedwait" number="177"/> - <syscall name="rt_sigqueueinfo" number="178"/> - <syscall name="rt_sigsuspend" number="179"/> - <syscall name="pread64" number="180"/> - <syscall name="pwrite64" number="181"/> - <syscall name="chown" number="182"/> - <syscall name="getcwd" number="183"/> - <syscall name="capget" number="184"/> - <syscall name="capset" number="185"/> - <syscall name="sigaltstack" number="186"/> - <syscall name="sendfile" number="187"/> - <syscall name="getpmsg" number="188"/> - <syscall name="putpmsg" number="189"/> - <syscall name="vfork" number="190"/> - <syscall name="ugetrlimit" number="191"/> - <syscall name="mmap2" number="192"/> - <syscall name="truncate64" number="193"/> - <syscall 
name="ftruncate64" number="194"/> - <syscall name="stat64" number="195"/> - <syscall name="lstat64" number="196"/> - <syscall name="fstat64" number="197"/> - <syscall name="lchown32" number="198"/> - <syscall name="getuid32" number="199"/> - <syscall name="getgid32" number="200"/> - <syscall name="geteuid32" number="201"/> - <syscall name="getegid32" number="202"/> - <syscall name="setreuid32" number="203"/> - <syscall name="setregid32" number="204"/> - <syscall name="getgroups32" number="205"/> - <syscall name="setgroups32" number="206"/> - <syscall name="fchown32" number="207"/> - <syscall name="setresuid32" number="208"/> - <syscall name="getresuid32" number="209"/> - <syscall name="setresgid32" number="210"/> - <syscall name="getresgid32" number="211"/> - <syscall name="chown32" number="212"/> - <syscall name="setuid32" number="213"/> - <syscall name="setgid32" number="214"/> - <syscall name="setfsuid32" number="215"/> - <syscall name="setfsgid32" number="216"/> - <syscall name="pivot_root" number="217"/> - <syscall name="mincore" number="218"/> - <syscall name="madvise" number="219"/> - <syscall name="madvise1" number="220"/> - <syscall name="getdents64" number="221"/> - <syscall name="fcntl64" number="222"/> - <syscall name="gettid" number="224"/> - <syscall name="readahead" number="225"/> - <syscall name="setxattr" number="226"/> - <syscall name="lsetxattr" number="227"/> - <syscall name="fsetxattr" number="228"/> - <syscall name="getxattr" number="229"/> - <syscall name="lgetxattr" number="230"/> - <syscall name="fgetxattr" number="231"/> - <syscall name="listxattr" number="232"/> - <syscall name="llistxattr" number="233"/> - <syscall name="flistxattr" number="234"/> - <syscall name="removexattr" number="235"/> - <syscall name="lremovexattr" number="236"/> - <syscall name="fremovexattr" number="237"/> - <syscall name="tkill" number="238"/> - <syscall name="sendfile64" number="239"/> - <syscall name="futex" number="240"/> - <syscall name="sched_setaffinity" 
number="241"/> - <syscall name="sched_getaffinity" number="242"/> - <syscall name="set_thread_area" number="243"/> - <syscall name="get_thread_area" number="244"/> - <syscall name="io_setup" number="245"/> - <syscall name="io_destroy" number="246"/> - <syscall name="io_getevents" number="247"/> - <syscall name="io_submit" number="248"/> - <syscall name="io_cancel" number="249"/> - <syscall name="fadvise64" number="250"/> - <syscall name="exit_group" number="252"/> - <syscall name="lookup_dcookie" number="253"/> - <syscall name="epoll_create" number="254"/> - <syscall name="epoll_ctl" number="255"/> - <syscall name="epoll_wait" number="256"/> - <syscall name="remap_file_pages" number="257"/> - <syscall name="set_tid_address" number="258"/> - <syscall name="timer_create" number="259"/> - <syscall name="timer_settime" number="260"/> - <syscall name="timer_gettime" number="261"/> - <syscall name="timer_getoverrun" number="262"/> - <syscall name="timer_delete" number="263"/> - <syscall name="clock_settime" number="264"/> - <syscall name="clock_gettime" number="265"/> - <syscall name="clock_getres" number="266"/> - <syscall name="clock_nanosleep" number="267"/> - <syscall name="statfs64" number="268"/> - <syscall name="fstatfs64" number="269"/> - <syscall name="tgkill" number="270"/> - <syscall name="utimes" number="271"/> - <syscall name="fadvise64_64" number="272"/> - <syscall name="vserver" number="273"/> - <syscall name="mbind" number="274"/> - <syscall name="get_mempolicy" number="275"/> - <syscall name="set_mempolicy" number="276"/> - <syscall name="mq_open" number="277"/> - <syscall name="mq_unlink" number="278"/> - <syscall name="mq_timedsend" number="279"/> - <syscall name="mq_timedreceive" number="280"/> - <syscall name="mq_notify" number="281"/> - <syscall name="mq_getsetattr" number="282"/> - <syscall name="kexec_load" number="283"/> - <syscall name="waitid" number="284"/> - <syscall name="add_key" number="286"/> - <syscall name="request_key" number="287"/> - 
<syscall name="keyctl" number="288"/> - <syscall name="ioprio_set" number="289"/> - <syscall name="ioprio_get" number="290"/> - <syscall name="inotify_init" number="291"/> - <syscall name="inotify_add_watch" number="292"/> - <syscall name="inotify_rm_watch" number="293"/> - <syscall name="migrate_pages" number="294"/> - <syscall name="openat" number="295"/> - <syscall name="mkdirat" number="296"/> - <syscall name="mknodat" number="297"/> - <syscall name="fchownat" number="298"/> - <syscall name="futimesat" number="299"/> - <syscall name="fstatat64" number="300"/> - <syscall name="unlinkat" number="301"/> - <syscall name="renameat" number="302"/> - <syscall name="linkat" number="303"/> - <syscall name="symlinkat" number="304"/> - <syscall name="readlinkat" number="305"/> - <syscall name="fchmodat" number="306"/> - <syscall name="faccessat" number="307"/> - <syscall name="pselect6" number="308"/> - <syscall name="ppoll" number="309"/> - <syscall name="unshare" number="310"/> - <syscall name="set_robust_list" number="311"/> - <syscall name="get_robust_list" number="312"/> - <syscall name="splice" number="313"/> - <syscall name="sync_file_range" number="314"/> - <syscall name="tee" number="315"/> - <syscall name="vmsplice" number="316"/> - <syscall name="move_pages" number="317"/> - <syscall name="getcpu" number="318"/> - <syscall name="epoll_pwait" number="319"/> - <syscall name="utimensat" number="320"/> - <syscall name="signalfd" number="321"/> - <syscall name="timerfd_create" number="322"/> - <syscall name="eventfd" number="323"/> - <syscall name="fallocate" number="324"/> - <syscall name="timerfd_settime" number="325"/> -</syscalls_info> diff --git a/share/gdb/syscalls/mips-n32-linux.xml b/share/gdb/syscalls/mips-n32-linux.xml deleted file mode 100644 index b4e2181..0000000 --- a/share/gdb/syscalls/mips-n32-linux.xml +++ /dev/null @@ -1,319 +0,0 @@ -<?xml version="1.0"?> -<!-- Copyright (C) 2011-2013 Free Software Foundation, Inc. 
- - Copying and distribution of this file, with or without modification, - are permitted in any medium without royalty provided the copyright - notice and this notice are preserved. --> - -<!DOCTYPE feature SYSTEM "gdb-syscalls.dtd"> - -<!-- This file was generated using the following file: - - /usr/src/linux/arch/mips/include/asm/unistd.h - - The file mentioned above belongs to the Linux Kernel. --> - -<syscalls_info> - <syscall name="read" number="6000"/> - <syscall name="write" number="6001"/> - <syscall name="open" number="6002"/> - <syscall name="close" number="6003"/> - <syscall name="stat" number="6004"/> - <syscall name="fstat" number="6005"/> - <syscall name="lstat" number="6006"/> - <syscall name="poll" number="6007"/> - <syscall name="lseek" number="6008"/> - <syscall name="mmap" number="6009"/> - <syscall name="mprotect" number="6010"/> - <syscall name="munmap" number="6011"/> - <syscall name="brk" number="6012"/> - <syscall name="rt_sigaction" number="6013"/> - <syscall name="rt_sigprocmask" number="6014"/> - <syscall name="ioctl" number="6015"/> - <syscall name="pread64" number="6016"/> - <syscall name="pwrite64" number="6017"/> - <syscall name="readv" number="6018"/> - <syscall name="writev" number="6019"/> - <syscall name="access" number="6020"/> - <syscall name="pipe" number="6021"/> - <syscall name="_newselect" number="6022"/> - <syscall name="sched_yield" number="6023"/> - <syscall name="mremap" number="6024"/> - <syscall name="msync" number="6025"/> - <syscall name="mincore" number="6026"/> - <syscall name="madvise" number="6027"/> - <syscall name="shmget" number="6028"/> - <syscall name="shmat" number="6029"/> - <syscall name="shmctl" number="6030"/> - <syscall name="dup" number="6031"/> - <syscall name="dup2" number="6032"/> - <syscall name="pause" number="6033"/> - <syscall name="nanosleep" number="6034"/> - <syscall name="getitimer" number="6035"/> - <syscall name="setitimer" number="6036"/> - <syscall name="alarm" number="6037"/> - <syscall 
name="getpid" number="6038"/> - <syscall name="sendfile" number="6039"/> - <syscall name="socket" number="6040"/> - <syscall name="connect" number="6041"/> - <syscall name="accept" number="6042"/> - <syscall name="sendto" number="6043"/> - <syscall name="recvfrom" number="6044"/> - <syscall name="sendmsg" number="6045"/> - <syscall name="recvmsg" number="6046"/> - <syscall name="shutdown" number="6047"/> - <syscall name="bind" number="6048"/> - <syscall name="listen" number="6049"/> - <syscall name="getsockname" number="6050"/> - <syscall name="getpeername" number="6051"/> - <syscall name="socketpair" number="6052"/> - <syscall name="setsockopt" number="6053"/> - <syscall name="getsockopt" number="6054"/> - <syscall name="clone" number="6055"/> - <syscall name="fork" number="6056"/> - <syscall name="execve" number="6057"/> - <syscall name="exit" number="6058"/> - <syscall name="wait4" number="6059"/> - <syscall name="kill" number="6060"/> - <syscall name="uname" number="6061"/> - <syscall name="semget" number="6062"/> - <syscall name="semop" number="6063"/> - <syscall name="semctl" number="6064"/> - <syscall name="shmdt" number="6065"/> - <syscall name="msgget" number="6066"/> - <syscall name="msgsnd" number="6067"/> - <syscall name="msgrcv" number="6068"/> - <syscall name="msgctl" number="6069"/> - <syscall name="fcntl" number="6070"/> - <syscall name="flock" number="6071"/> - <syscall name="fsync" number="6072"/> - <syscall name="fdatasync" number="6073"/> - <syscall name="truncate" number="6074"/> - <syscall name="ftruncate" number="6075"/> - <syscall name="getdents" number="6076"/> - <syscall name="getcwd" number="6077"/> - <syscall name="chdir" number="6078"/> - <syscall name="fchdir" number="6079"/> - <syscall name="rename" number="6080"/> - <syscall name="mkdir" number="6081"/> - <syscall name="rmdir" number="6082"/> - <syscall name="creat" number="6083"/> - <syscall name="link" number="6084"/> - <syscall name="unlink" number="6085"/> - <syscall 
name="symlink" number="6086"/> - <syscall name="readlink" number="6087"/> - <syscall name="chmod" number="6088"/> - <syscall name="fchmod" number="6089"/> - <syscall name="chown" number="6090"/> - <syscall name="fchown" number="6091"/> - <syscall name="lchown" number="6092"/> - <syscall name="umask" number="6093"/> - <syscall name="gettimeofday" number="6094"/> - <syscall name="getrlimit" number="6095"/> - <syscall name="getrusage" number="6096"/> - <syscall name="sysinfo" number="6097"/> - <syscall name="times" number="6098"/> - <syscall name="ptrace" number="6099"/> - <syscall name="getuid" number="6100"/> - <syscall name="syslog" number="6101"/> - <syscall name="getgid" number="6102"/> - <syscall name="setuid" number="6103"/> - <syscall name="setgid" number="6104"/> - <syscall name="geteuid" number="6105"/> - <syscall name="getegid" number="6106"/> - <syscall name="setpgid" number="6107"/> - <syscall name="getppid" number="6108"/> - <syscall name="getpgrp" number="6109"/> - <syscall name="setsid" number="6110"/> - <syscall name="setreuid" number="6111"/> - <syscall name="setregid" number="6112"/> - <syscall name="getgroups" number="6113"/> - <syscall name="setgroups" number="6114"/> - <syscall name="setresuid" number="6115"/> - <syscall name="getresuid" number="6116"/> - <syscall name="setresgid" number="6117"/> - <syscall name="getresgid" number="6118"/> - <syscall name="getpgid" number="6119"/> - <syscall name="setfsuid" number="6120"/> - <syscall name="setfsgid" number="6121"/> - <syscall name="getsid" number="6122"/> - <syscall name="capget" number="6123"/> - <syscall name="capset" number="6124"/> - <syscall name="rt_sigpending" number="6125"/> - <syscall name="rt_sigtimedwait" number="6126"/> - <syscall name="rt_sigqueueinfo" number="6127"/> - <syscall name="rt_sigsuspend" number="6128"/> - <syscall name="sigaltstack" number="6129"/> - <syscall name="utime" number="6130"/> - <syscall name="mknod" number="6131"/> - <syscall name="personality" number="6132"/> 
- <syscall name="ustat" number="6133"/> - <syscall name="statfs" number="6134"/> - <syscall name="fstatfs" number="6135"/> - <syscall name="sysfs" number="6136"/> - <syscall name="getpriority" number="6137"/> - <syscall name="setpriority" number="6138"/> - <syscall name="sched_setparam" number="6139"/> - <syscall name="sched_getparam" number="6140"/> - <syscall name="sched_setscheduler" number="6141"/> - <syscall name="sched_getscheduler" number="6142"/> - <syscall name="sched_get_priority_max" number="6143"/> - <syscall name="sched_get_priority_min" number="6144"/> - <syscall name="sched_rr_get_interval" number="6145"/> - <syscall name="mlock" number="6146"/> - <syscall name="munlock" number="6147"/> - <syscall name="mlockall" number="6148"/> - <syscall name="munlockall" number="6149"/> - <syscall name="vhangup" number="6150"/> - <syscall name="pivot_root" number="6151"/> - <syscall name="_sysctl" number="6152"/> - <syscall name="prctl" number="6153"/> - <syscall name="adjtimex" number="6154"/> - <syscall name="setrlimit" number="6155"/> - <syscall name="chroot" number="6156"/> - <syscall name="sync" number="6157"/> - <syscall name="acct" number="6158"/> - <syscall name="settimeofday" number="6159"/> - <syscall name="mount" number="6160"/> - <syscall name="umount2" number="6161"/> - <syscall name="swapon" number="6162"/> - <syscall name="swapoff" number="6163"/> - <syscall name="reboot" number="6164"/> - <syscall name="sethostname" number="6165"/> - <syscall name="setdomainname" number="6166"/> - <syscall name="create_module" number="6167"/> - <syscall name="init_module" number="6168"/> - <syscall name="delete_module" number="6169"/> - <syscall name="get_kernel_syms" number="6170"/> - <syscall name="query_module" number="6171"/> - <syscall name="quotactl" number="6172"/> - <syscall name="nfsservctl" number="6173"/> - <syscall name="getpmsg" number="6174"/> - <syscall name="putpmsg" number="6175"/> - <syscall name="afs_syscall" number="6176"/> - <syscall 
name="reserved177" number="6177"/> - <syscall name="gettid" number="6178"/> - <syscall name="readahead" number="6179"/> - <syscall name="setxattr" number="6180"/> - <syscall name="lsetxattr" number="6181"/> - <syscall name="fsetxattr" number="6182"/> - <syscall name="getxattr" number="6183"/> - <syscall name="lgetxattr" number="6184"/> - <syscall name="fgetxattr" number="6185"/> - <syscall name="listxattr" number="6186"/> - <syscall name="llistxattr" number="6187"/> - <syscall name="flistxattr" number="6188"/> - <syscall name="removexattr" number="6189"/> - <syscall name="lremovexattr" number="6190"/> - <syscall name="fremovexattr" number="6191"/> - <syscall name="tkill" number="6192"/> - <syscall name="reserved193" number="6193"/> - <syscall name="futex" number="6194"/> - <syscall name="sched_setaffinity" number="6195"/> - <syscall name="sched_getaffinity" number="6196"/> - <syscall name="cacheflush" number="6197"/> - <syscall name="cachectl" number="6198"/> - <syscall name="sysmips" number="6199"/> - <syscall name="io_setup" number="6200"/> - <syscall name="io_destroy" number="6201"/> - <syscall name="io_getevents" number="6202"/> - <syscall name="io_submit" number="6203"/> - <syscall name="io_cancel" number="6204"/> - <syscall name="exit_group" number="6205"/> - <syscall name="lookup_dcookie" number="6206"/> - <syscall name="epoll_create" number="6207"/> - <syscall name="epoll_ctl" number="6208"/> - <syscall name="epoll_wait" number="6209"/> - <syscall name="remap_file_pages" number="6210"/> - <syscall name="rt_sigreturn" number="6211"/> - <syscall name="fcntl64" number="6212"/> - <syscall name="set_tid_address" number="6213"/> - <syscall name="restart_syscall" number="6214"/> - <syscall name="semtimedop" number="6215"/> - <syscall name="fadvise64" number="6216"/> - <syscall name="statfs64" number="6217"/> - <syscall name="fstatfs64" number="6218"/> - <syscall name="sendfile64" number="6219"/> - <syscall name="timer_create" number="6220"/> - <syscall 
name="timer_settime" number="6221"/> - <syscall name="timer_gettime" number="6222"/> - <syscall name="timer_getoverrun" number="6223"/> - <syscall name="timer_delete" number="6224"/> - <syscall name="clock_settime" number="6225"/> - <syscall name="clock_gettime" number="6226"/> - <syscall name="clock_getres" number="6227"/> - <syscall name="clock_nanosleep" number="6228"/> - <syscall name="tgkill" number="6229"/> - <syscall name="utimes" number="6230"/> - <syscall name="mbind" number="6231"/> - <syscall name="get_mempolicy" number="6232"/> - <syscall name="set_mempolicy" number="6233"/> - <syscall name="mq_open" number="6234"/> - <syscall name="mq_unlink" number="6235"/> - <syscall name="mq_timedsend" number="6236"/> - <syscall name="mq_timedreceive" number="6237"/> - <syscall name="mq_notify" number="6238"/> - <syscall name="mq_getsetattr" number="6239"/> - <syscall name="vserver" number="6240"/> - <syscall name="waitid" number="6241"/> - <syscall name="add_key" number="6243"/> - <syscall name="request_key" number="6244"/> - <syscall name="keyctl" number="6245"/> - <syscall name="set_thread_area" number="6246"/> - <syscall name="inotify_init" number="6247"/> - <syscall name="inotify_add_watch" number="6248"/> - <syscall name="inotify_rm_watch" number="6249"/> - <syscall name="migrate_pages" number="6250"/> - <syscall name="openat" number="6251"/> - <syscall name="mkdirat" number="6252"/> - <syscall name="mknodat" number="6253"/> - <syscall name="fchownat" number="6254"/> - <syscall name="futimesat" number="6255"/> - <syscall name="newfstatat" number="6256"/> - <syscall name="unlinkat" number="6257"/> - <syscall name="renameat" number="6258"/> - <syscall name="linkat" number="6259"/> - <syscall name="symlinkat" number="6260"/> - <syscall name="readlinkat" number="6261"/> - <syscall name="fchmodat" number="6262"/> - <syscall name="faccessat" number="6263"/> - <syscall name="pselect6" number="6264"/> - <syscall name="ppoll" number="6265"/> - <syscall name="unshare" 
number="6266"/> - <syscall name="splice" number="6267"/> - <syscall name="sync_file_range" number="6268"/> - <syscall name="tee" number="6269"/> - <syscall name="vmsplice" number="6270"/> - <syscall name="move_pages" number="6271"/> - <syscall name="set_robust_list" number="6272"/> - <syscall name="get_robust_list" number="6273"/> - <syscall name="kexec_load" number="6274"/> - <syscall name="getcpu" number="6275"/> - <syscall name="epoll_pwait" number="6276"/> - <syscall name="ioprio_set" number="6277"/> - <syscall name="ioprio_get" number="6278"/> - <syscall name="utimensat" number="6279"/> - <syscall name="signalfd" number="6280"/> - <syscall name="timerfd" number="6281"/> - <syscall name="eventfd" number="6282"/> - <syscall name="fallocate" number="6283"/> - <syscall name="timerfd_create" number="6284"/> - <syscall name="timerfd_gettime" number="6285"/> - <syscall name="timerfd_settime" number="6286"/> - <syscall name="signalfd4" number="6287"/> - <syscall name="eventfd2" number="6288"/> - <syscall name="epoll_create1" number="6289"/> - <syscall name="dup3" number="6290"/> - <syscall name="pipe2" number="6291"/> - <syscall name="inotify_init1" number="6292"/> - <syscall name="preadv" number="6293"/> - <syscall name="pwritev" number="6294"/> - <syscall name="rt_tgsigqueueinfo" number="6295"/> - <syscall name="perf_event_open" number="6296"/> - <syscall name="accept4" number="6297"/> - <syscall name="recvmmsg" number="6298"/> - <syscall name="getdents64" number="6299"/> - <syscall name="fanotify_init" number="6300"/> - <syscall name="fanotify_mark" number="6301"/> - <syscall name="prlimit64" number="6302"/> -</syscalls_info> diff --git a/share/gdb/syscalls/mips-n64-linux.xml b/share/gdb/syscalls/mips-n64-linux.xml deleted file mode 100644 index 896e0c0..0000000 --- a/share/gdb/syscalls/mips-n64-linux.xml +++ /dev/null @@ -1,312 +0,0 @@ -<?xml version="1.0"?> -<!-- Copyright (C) 2011-2013 Free Software Foundation, Inc. 
- - Copying and distribution of this file, with or without modification, - are permitted in any medium without royalty provided the copyright - notice and this notice are preserved. --> - -<!DOCTYPE feature SYSTEM "gdb-syscalls.dtd"> - -<!-- This file was generated using the following file: - - /usr/src/linux/arch/mips/include/asm/unistd.h - - The file mentioned above belongs to the Linux Kernel. --> - -<syscalls_info> - <syscall name="read" number="5000"/> - <syscall name="write" number="5001"/> - <syscall name="open" number="5002"/> - <syscall name="close" number="5003"/> - <syscall name="stat" number="5004"/> - <syscall name="fstat" number="5005"/> - <syscall name="lstat" number="5006"/> - <syscall name="poll" number="5007"/> - <syscall name="lseek" number="5008"/> - <syscall name="mmap" number="5009"/> - <syscall name="mprotect" number="5010"/> - <syscall name="munmap" number="5011"/> - <syscall name="brk" number="5012"/> - <syscall name="rt_sigaction" number="5013"/> - <syscall name="rt_sigprocmask" number="5014"/> - <syscall name="ioctl" number="5015"/> - <syscall name="pread64" number="5016"/> - <syscall name="pwrite64" number="5017"/> - <syscall name="readv" number="5018"/> - <syscall name="writev" number="5019"/> - <syscall name="access" number="5020"/> - <syscall name="pipe" number="5021"/> - <syscall name="_newselect" number="5022"/> - <syscall name="sched_yield" number="5023"/> - <syscall name="mremap" number="5024"/> - <syscall name="msync" number="5025"/> - <syscall name="mincore" number="5026"/> - <syscall name="madvise" number="5027"/> - <syscall name="shmget" number="5028"/> - <syscall name="shmat" number="5029"/> - <syscall name="shmctl" number="5030"/> - <syscall name="dup" number="5031"/> - <syscall name="dup2" number="5032"/> - <syscall name="pause" number="5033"/> - <syscall name="nanosleep" number="5034"/> - <syscall name="getitimer" number="5035"/> - <syscall name="setitimer" number="5036"/> - <syscall name="alarm" number="5037"/> - <syscall 
name="getpid" number="5038"/> - <syscall name="sendfile" number="5039"/> - <syscall name="socket" number="5040"/> - <syscall name="connect" number="5041"/> - <syscall name="accept" number="5042"/> - <syscall name="sendto" number="5043"/> - <syscall name="recvfrom" number="5044"/> - <syscall name="sendmsg" number="5045"/> - <syscall name="recvmsg" number="5046"/> - <syscall name="shutdown" number="5047"/> - <syscall name="bind" number="5048"/> - <syscall name="listen" number="5049"/> - <syscall name="getsockname" number="5050"/> - <syscall name="getpeername" number="5051"/> - <syscall name="socketpair" number="5052"/> - <syscall name="setsockopt" number="5053"/> - <syscall name="getsockopt" number="5054"/> - <syscall name="clone" number="5055"/> - <syscall name="fork" number="5056"/> - <syscall name="execve" number="5057"/> - <syscall name="exit" number="5058"/> - <syscall name="wait4" number="5059"/> - <syscall name="kill" number="5060"/> - <syscall name="uname" number="5061"/> - <syscall name="semget" number="5062"/> - <syscall name="semop" number="5063"/> - <syscall name="semctl" number="5064"/> - <syscall name="shmdt" number="5065"/> - <syscall name="msgget" number="5066"/> - <syscall name="msgsnd" number="5067"/> - <syscall name="msgrcv" number="5068"/> - <syscall name="msgctl" number="5069"/> - <syscall name="fcntl" number="5070"/> - <syscall name="flock" number="5071"/> - <syscall name="fsync" number="5072"/> - <syscall name="fdatasync" number="5073"/> - <syscall name="truncate" number="5074"/> - <syscall name="ftruncate" number="5075"/> - <syscall name="getdents" number="5076"/> - <syscall name="getcwd" number="5077"/> - <syscall name="chdir" number="5078"/> - <syscall name="fchdir" number="5079"/> - <syscall name="rename" number="5080"/> - <syscall name="mkdir" number="5081"/> - <syscall name="rmdir" number="5082"/> - <syscall name="creat" number="5083"/> - <syscall name="link" number="5084"/> - <syscall name="unlink" number="5085"/> - <syscall 
name="symlink" number="5086"/> - <syscall name="readlink" number="5087"/> - <syscall name="chmod" number="5088"/> - <syscall name="fchmod" number="5089"/> - <syscall name="chown" number="5090"/> - <syscall name="fchown" number="5091"/> - <syscall name="lchown" number="5092"/> - <syscall name="umask" number="5093"/> - <syscall name="gettimeofday" number="5094"/> - <syscall name="getrlimit" number="5095"/> - <syscall name="getrusage" number="5096"/> - <syscall name="sysinfo" number="5097"/> - <syscall name="times" number="5098"/> - <syscall name="ptrace" number="5099"/> - <syscall name="getuid" number="5100"/> - <syscall name="syslog" number="5101"/> - <syscall name="getgid" number="5102"/> - <syscall name="setuid" number="5103"/> - <syscall name="setgid" number="5104"/> - <syscall name="geteuid" number="5105"/> - <syscall name="getegid" number="5106"/> - <syscall name="setpgid" number="5107"/> - <syscall name="getppid" number="5108"/> - <syscall name="getpgrp" number="5109"/> - <syscall name="setsid" number="5110"/> - <syscall name="setreuid" number="5111"/> - <syscall name="setregid" number="5112"/> - <syscall name="getgroups" number="5113"/> - <syscall name="setgroups" number="5114"/> - <syscall name="setresuid" number="5115"/> - <syscall name="getresuid" number="5116"/> - <syscall name="setresgid" number="5117"/> - <syscall name="getresgid" number="5118"/> - <syscall name="getpgid" number="5119"/> - <syscall name="setfsuid" number="5120"/> - <syscall name="setfsgid" number="5121"/> - <syscall name="getsid" number="5122"/> - <syscall name="capget" number="5123"/> - <syscall name="capset" number="5124"/> - <syscall name="rt_sigpending" number="5125"/> - <syscall name="rt_sigtimedwait" number="5126"/> - <syscall name="rt_sigqueueinfo" number="5127"/> - <syscall name="rt_sigsuspend" number="5128"/> - <syscall name="sigaltstack" number="5129"/> - <syscall name="utime" number="5130"/> - <syscall name="mknod" number="5131"/> - <syscall name="personality" number="5132"/> 
- <syscall name="ustat" number="5133"/> - <syscall name="statfs" number="5134"/> - <syscall name="fstatfs" number="5135"/> - <syscall name="sysfs" number="5136"/> - <syscall name="getpriority" number="5137"/> - <syscall name="setpriority" number="5138"/> - <syscall name="sched_setparam" number="5139"/> - <syscall name="sched_getparam" number="5140"/> - <syscall name="sched_setscheduler" number="5141"/> - <syscall name="sched_getscheduler" number="5142"/> - <syscall name="sched_get_priority_max" number="5143"/> - <syscall name="sched_get_priority_min" number="5144"/> - <syscall name="sched_rr_get_interval" number="5145"/> - <syscall name="mlock" number="5146"/> - <syscall name="munlock" number="5147"/> - <syscall name="mlockall" number="5148"/> - <syscall name="munlockall" number="5149"/> - <syscall name="vhangup" number="5150"/> - <syscall name="pivot_root" number="5151"/> - <syscall name="_sysctl" number="5152"/> - <syscall name="prctl" number="5153"/> - <syscall name="adjtimex" number="5154"/> - <syscall name="setrlimit" number="5155"/> - <syscall name="chroot" number="5156"/> - <syscall name="sync" number="5157"/> - <syscall name="acct" number="5158"/> - <syscall name="settimeofday" number="5159"/> - <syscall name="mount" number="5160"/> - <syscall name="umount2" number="5161"/> - <syscall name="swapon" number="5162"/> - <syscall name="swapoff" number="5163"/> - <syscall name="reboot" number="5164"/> - <syscall name="sethostname" number="5165"/> - <syscall name="setdomainname" number="5166"/> - <syscall name="create_module" number="5167"/> - <syscall name="init_module" number="5168"/> - <syscall name="delete_module" number="5169"/> - <syscall name="get_kernel_syms" number="5170"/> - <syscall name="query_module" number="5171"/> - <syscall name="quotactl" number="5172"/> - <syscall name="nfsservctl" number="5173"/> - <syscall name="getpmsg" number="5174"/> - <syscall name="putpmsg" number="5175"/> - <syscall name="afs_syscall" number="5176"/> - <syscall 
name="gettid" number="5178"/> - <syscall name="readahead" number="5179"/> - <syscall name="setxattr" number="5180"/> - <syscall name="lsetxattr" number="5181"/> - <syscall name="fsetxattr" number="5182"/> - <syscall name="getxattr" number="5183"/> - <syscall name="lgetxattr" number="5184"/> - <syscall name="fgetxattr" number="5185"/> - <syscall name="listxattr" number="5186"/> - <syscall name="llistxattr" number="5187"/> - <syscall name="flistxattr" number="5188"/> - <syscall name="removexattr" number="5189"/> - <syscall name="lremovexattr" number="5190"/> - <syscall name="fremovexattr" number="5191"/> - <syscall name="tkill" number="5192"/> - <syscall name="futex" number="5194"/> - <syscall name="sched_setaffinity" number="5195"/> - <syscall name="sched_getaffinity" number="5196"/> - <syscall name="cacheflush" number="5197"/> - <syscall name="cachectl" number="5198"/> - <syscall name="sysmips" number="5199"/> - <syscall name="io_setup" number="5200"/> - <syscall name="io_destroy" number="5201"/> - <syscall name="io_getevents" number="5202"/> - <syscall name="io_submit" number="5203"/> - <syscall name="io_cancel" number="5204"/> - <syscall name="exit_group" number="5205"/> - <syscall name="lookup_dcookie" number="5206"/> - <syscall name="epoll_create" number="5207"/> - <syscall name="epoll_ctl" number="5208"/> - <syscall name="epoll_wait" number="5209"/> - <syscall name="remap_file_pages" number="5210"/> - <syscall name="rt_sigreturn" number="5211"/> - <syscall name="set_tid_address" number="5212"/> - <syscall name="restart_syscall" number="5213"/> - <syscall name="semtimedop" number="5214"/> - <syscall name="fadvise64" number="5215"/> - <syscall name="timer_create" number="5216"/> - <syscall name="timer_settime" number="5217"/> - <syscall name="timer_gettime" number="5218"/> - <syscall name="timer_getoverrun" number="5219"/> - <syscall name="timer_delete" number="5220"/> - <syscall name="clock_settime" number="5221"/> - <syscall name="clock_gettime" 
number="5222"/> - <syscall name="clock_getres" number="5223"/> - <syscall name="clock_nanosleep" number="5224"/> - <syscall name="tgkill" number="5225"/> - <syscall name="utimes" number="5226"/> - <syscall name="mbind" number="5227"/> - <syscall name="get_mempolicy" number="5228"/> - <syscall name="set_mempolicy" number="5229"/> - <syscall name="mq_open" number="5230"/> - <syscall name="mq_unlink" number="5231"/> - <syscall name="mq_timedsend" number="5232"/> - <syscall name="mq_timedreceive" number="5233"/> - <syscall name="mq_notify" number="5234"/> - <syscall name="mq_getsetattr" number="5235"/> - <syscall name="vserver" number="5236"/> - <syscall name="waitid" number="5237"/> - <syscall name="add_key" number="5239"/> - <syscall name="request_key" number="5240"/> - <syscall name="keyctl" number="5241"/> - <syscall name="set_thread_area" number="5242"/> - <syscall name="inotify_init" number="5243"/> - <syscall name="inotify_add_watch" number="5244"/> - <syscall name="inotify_rm_watch" number="5245"/> - <syscall name="migrate_pages" number="5246"/> - <syscall name="openat" number="5247"/> - <syscall name="mkdirat" number="5248"/> - <syscall name="mknodat" number="5249"/> - <syscall name="fchownat" number="5250"/> - <syscall name="futimesat" number="5251"/> - <syscall name="newfstatat" number="5252"/> - <syscall name="unlinkat" number="5253"/> - <syscall name="renameat" number="5254"/> - <syscall name="linkat" number="5255"/> - <syscall name="symlinkat" number="5256"/> - <syscall name="readlinkat" number="5257"/> - <syscall name="fchmodat" number="5258"/> - <syscall name="faccessat" number="5259"/> - <syscall name="pselect6" number="5260"/> - <syscall name="ppoll" number="5261"/> - <syscall name="unshare" number="5262"/> - <syscall name="splice" number="5263"/> - <syscall name="sync_file_range" number="5264"/> - <syscall name="tee" number="5265"/> - <syscall name="vmsplice" number="5266"/> - <syscall name="move_pages" number="5267"/> - <syscall 
name="set_robust_list" number="5268"/> - <syscall name="get_robust_list" number="5269"/> - <syscall name="kexec_load" number="5270"/> - <syscall name="getcpu" number="5271"/> - <syscall name="epoll_pwait" number="5272"/> - <syscall name="ioprio_set" number="5273"/> - <syscall name="ioprio_get" number="5274"/> - <syscall name="utimensat" number="5275"/> - <syscall name="signalfd" number="5276"/> - <syscall name="timerfd" number="5277"/> - <syscall name="eventfd" number="5278"/> - <syscall name="fallocate" number="5279"/> - <syscall name="timerfd_create" number="5280"/> - <syscall name="timerfd_gettime" number="5281"/> - <syscall name="timerfd_settime" number="5282"/> - <syscall name="signalfd4" number="5283"/> - <syscall name="eventfd2" number="5284"/> - <syscall name="epoll_create1" number="5285"/> - <syscall name="dup3" number="5286"/> - <syscall name="pipe2" number="5287"/> - <syscall name="inotify_init1" number="5288"/> - <syscall name="preadv" number="5289"/> - <syscall name="pwritev" number="5290"/> - <syscall name="rt_tgsigqueueinfo" number="5291"/> - <syscall name="perf_event_open" number="5292"/> - <syscall name="accept4" number="5293"/> - <syscall name="recvmmsg" number="5294"/> - <syscall name="fanotify_init" number="5295"/> - <syscall name="fanotify_mark" number="5296"/> - <syscall name="prlimit64" number="5297"/> -</syscalls_info> diff --git a/share/gdb/syscalls/mips-o32-linux.xml b/share/gdb/syscalls/mips-o32-linux.xml deleted file mode 100644 index 2b11247..0000000 --- a/share/gdb/syscalls/mips-o32-linux.xml +++ /dev/null @@ -1,347 +0,0 @@ -<?xml version="1.0"?> -<!-- Copyright (C) 2011-2013 Free Software Foundation, Inc. - - Copying and distribution of this file, with or without modification, - are permitted in any medium without royalty provided the copyright - notice and this notice are preserved. 
--> - -<!DOCTYPE feature SYSTEM "gdb-syscalls.dtd"> - -<!-- This file was generated using the following file: - - /usr/src/linux/arch/mips/include/asm/unistd.h - - The file mentioned above belongs to the Linux Kernel. --> - -<syscalls_info> - <syscall name="syscall" number="4000"/> - <syscall name="exit" number="4001"/> - <syscall name="fork" number="4002"/> - <syscall name="read" number="4003"/> - <syscall name="write" number="4004"/> - <syscall name="open" number="4005"/> - <syscall name="close" number="4006"/> - <syscall name="waitpid" number="4007"/> - <syscall name="creat" number="4008"/> - <syscall name="link" number="4009"/> - <syscall name="unlink" number="4010"/> - <syscall name="execve" number="4011"/> - <syscall name="chdir" number="4012"/> - <syscall name="time" number="4013"/> - <syscall name="mknod" number="4014"/> - <syscall name="chmod" number="4015"/> - <syscall name="lchown" number="4016"/> - <syscall name="break" number="4017"/> - <syscall name="lseek" number="4019"/> - <syscall name="getpid" number="4020"/> - <syscall name="mount" number="4021"/> - <syscall name="umount" number="4022"/> - <syscall name="setuid" number="4023"/> - <syscall name="getuid" number="4024"/> - <syscall name="stime" number="4025"/> - <syscall name="ptrace" number="4026"/> - <syscall name="alarm" number="4027"/> - <syscall name="pause" number="4029"/> - <syscall name="utime" number="4030"/> - <syscall name="stty" number="4031"/> - <syscall name="gtty" number="4032"/> - <syscall name="access" number="4033"/> - <syscall name="nice" number="4034"/> - <syscall name="ftime" number="4035"/> - <syscall name="sync" number="4036"/> - <syscall name="kill" number="4037"/> - <syscall name="rename" number="4038"/> - <syscall name="mkdir" number="4039"/> - <syscall name="rmdir" number="4040"/> - <syscall name="dup" number="4041"/> - <syscall name="pipe" number="4042"/> - <syscall name="times" number="4043"/> - <syscall name="prof" number="4044"/> - <syscall name="brk" number="4045"/> - 
<syscall name="setgid" number="4046"/> - <syscall name="getgid" number="4047"/> - <syscall name="signal" number="4048"/> - <syscall name="geteuid" number="4049"/> - <syscall name="getegid" number="4050"/> - <syscall name="acct" number="4051"/> - <syscall name="umount2" number="4052"/> - <syscall name="lock" number="4053"/> - <syscall name="ioctl" number="4054"/> - <syscall name="fcntl" number="4055"/> - <syscall name="mpx" number="4056"/> - <syscall name="setpgid" number="4057"/> - <syscall name="ulimit" number="4058"/> - <syscall name="umask" number="4060"/> - <syscall name="chroot" number="4061"/> - <syscall name="ustat" number="4062"/> - <syscall name="dup2" number="4063"/> - <syscall name="getppid" number="4064"/> - <syscall name="getpgrp" number="4065"/> - <syscall name="setsid" number="4066"/> - <syscall name="sigaction" number="4067"/> - <syscall name="sgetmask" number="4068"/> - <syscall name="ssetmask" number="4069"/> - <syscall name="setreuid" number="4070"/> - <syscall name="setregid" number="4071"/> - <syscall name="sigsuspend" number="4072"/> - <syscall name="sigpending" number="4073"/> - <syscall name="sethostname" number="4074"/> - <syscall name="setrlimit" number="4075"/> - <syscall name="getrlimit" number="4076"/> - <syscall name="getrusage" number="4077"/> - <syscall name="gettimeofday" number="4078"/> - <syscall name="settimeofday" number="4079"/> - <syscall name="getgroups" number="4080"/> - <syscall name="setgroups" number="4081"/> - <syscall name="symlink" number="4083"/> - <syscall name="readlink" number="4085"/> - <syscall name="uselib" number="4086"/> - <syscall name="swapon" number="4087"/> - <syscall name="reboot" number="4088"/> - <syscall name="readdir" number="4089"/> - <syscall name="mmap" number="4090"/> - <syscall name="munmap" number="4091"/> - <syscall name="truncate" number="4092"/> - <syscall name="ftruncate" number="4093"/> - <syscall name="fchmod" number="4094"/> - <syscall name="fchown" number="4095"/> - <syscall 
name="getpriority" number="4096"/> - <syscall name="setpriority" number="4097"/> - <syscall name="profil" number="4098"/> - <syscall name="statfs" number="4099"/> - <syscall name="fstatfs" number="4100"/> - <syscall name="ioperm" number="4101"/> - <syscall name="socketcall" number="4102"/> - <syscall name="syslog" number="4103"/> - <syscall name="setitimer" number="4104"/> - <syscall name="getitimer" number="4105"/> - <syscall name="stat" number="4106"/> - <syscall name="lstat" number="4107"/> - <syscall name="fstat" number="4108"/> - <syscall name="iopl" number="4110"/> - <syscall name="vhangup" number="4111"/> - <syscall name="idle" number="4112"/> - <syscall name="vm86" number="4113"/> - <syscall name="wait4" number="4114"/> - <syscall name="swapoff" number="4115"/> - <syscall name="sysinfo" number="4116"/> - <syscall name="ipc" number="4117"/> - <syscall name="fsync" number="4118"/> - <syscall name="sigreturn" number="4119"/> - <syscall name="clone" number="4120"/> - <syscall name="setdomainname" number="4121"/> - <syscall name="uname" number="4122"/> - <syscall name="modify_ldt" number="4123"/> - <syscall name="adjtimex" number="4124"/> - <syscall name="mprotect" number="4125"/> - <syscall name="sigprocmask" number="4126"/> - <syscall name="create_module" number="4127"/> - <syscall name="init_module" number="4128"/> - <syscall name="delete_module" number="4129"/> - <syscall name="get_kernel_syms" number="4130"/> - <syscall name="quotactl" number="4131"/> - <syscall name="getpgid" number="4132"/> - <syscall name="fchdir" number="4133"/> - <syscall name="bdflush" number="4134"/> - <syscall name="sysfs" number="4135"/> - <syscall name="personality" number="4136"/> - <syscall name="afs_syscall" number="4137"/> - <syscall name="setfsuid" number="4138"/> - <syscall name="setfsgid" number="4139"/> - <syscall name="_llseek" number="4140"/> - <syscall name="getdents" number="4141"/> - <syscall name="_newselect" number="4142"/> - <syscall name="flock" number="4143"/> - 
<syscall name="msync" number="4144"/> - <syscall name="readv" number="4145"/> - <syscall name="writev" number="4146"/> - <syscall name="cacheflush" number="4147"/> - <syscall name="cachectl" number="4148"/> - <syscall name="sysmips" number="4149"/> - <syscall name="getsid" number="4151"/> - <syscall name="fdatasync" number="4152"/> - <syscall name="_sysctl" number="4153"/> - <syscall name="mlock" number="4154"/> - <syscall name="munlock" number="4155"/> - <syscall name="mlockall" number="4156"/> - <syscall name="munlockall" number="4157"/> - <syscall name="sched_setparam" number="4158"/> - <syscall name="sched_getparam" number="4159"/> - <syscall name="sched_setscheduler" number="4160"/> - <syscall name="sched_getscheduler" number="4161"/> - <syscall name="sched_yield" number="4162"/> - <syscall name="sched_get_priority_max" number="4163"/> - <syscall name="sched_get_priority_min" number="4164"/> - <syscall name="sched_rr_get_interval" number="4165"/> - <syscall name="nanosleep" number="4166"/> - <syscall name="mremap" number="4167"/> - <syscall name="accept" number="4168"/> - <syscall name="bind" number="4169"/> - <syscall name="connect" number="4170"/> - <syscall name="getpeername" number="4171"/> - <syscall name="getsockname" number="4172"/> - <syscall name="getsockopt" number="4173"/> - <syscall name="listen" number="4174"/> - <syscall name="recv" number="4175"/> - <syscall name="recvfrom" number="4176"/> - <syscall name="recvmsg" number="4177"/> - <syscall name="send" number="4178"/> - <syscall name="sendmsg" number="4179"/> - <syscall name="sendto" number="4180"/> - <syscall name="setsockopt" number="4181"/> - <syscall name="shutdown" number="4182"/> - <syscall name="socket" number="4183"/> - <syscall name="socketpair" number="4184"/> - <syscall name="setresuid" number="4185"/> - <syscall name="getresuid" number="4186"/> - <syscall name="query_module" number="4187"/> - <syscall name="poll" number="4188"/> - <syscall name="nfsservctl" number="4189"/> - 
<syscall name="setresgid" number="4190"/> - <syscall name="getresgid" number="4191"/> - <syscall name="prctl" number="4192"/> - <syscall name="rt_sigreturn" number="4193"/> - <syscall name="rt_sigaction" number="4194"/> - <syscall name="rt_sigprocmask" number="4195"/> - <syscall name="rt_sigpending" number="4196"/> - <syscall name="rt_sigtimedwait" number="4197"/> - <syscall name="rt_sigqueueinfo" number="4198"/> - <syscall name="rt_sigsuspend" number="4199"/> - <syscall name="pread64" number="4200"/> - <syscall name="pwrite64" number="4201"/> - <syscall name="chown" number="4202"/> - <syscall name="getcwd" number="4203"/> - <syscall name="capget" number="4204"/> - <syscall name="capset" number="4205"/> - <syscall name="sigaltstack" number="4206"/> - <syscall name="sendfile" number="4207"/> - <syscall name="getpmsg" number="4208"/> - <syscall name="putpmsg" number="4209"/> - <syscall name="mmap2" number="4210"/> - <syscall name="truncate64" number="4211"/> - <syscall name="ftruncate64" number="4212"/> - <syscall name="stat64" number="4213"/> - <syscall name="lstat64" number="4214"/> - <syscall name="fstat64" number="4215"/> - <syscall name="pivot_root" number="4216"/> - <syscall name="mincore" number="4217"/> - <syscall name="madvise" number="4218"/> - <syscall name="getdents64" number="4219"/> - <syscall name="fcntl64" number="4220"/> - <syscall name="gettid" number="4222"/> - <syscall name="readahead" number="4223"/> - <syscall name="setxattr" number="4224"/> - <syscall name="lsetxattr" number="4225"/> - <syscall name="fsetxattr" number="4226"/> - <syscall name="getxattr" number="4227"/> - <syscall name="lgetxattr" number="4228"/> - <syscall name="fgetxattr" number="4229"/> - <syscall name="listxattr" number="4230"/> - <syscall name="llistxattr" number="4231"/> - <syscall name="flistxattr" number="4232"/> - <syscall name="removexattr" number="4233"/> - <syscall name="lremovexattr" number="4234"/> - <syscall name="fremovexattr" number="4235"/> - <syscall 
name="tkill" number="4236"/> - <syscall name="sendfile64" number="4237"/> - <syscall name="futex" number="4238"/> - <syscall name="sched_setaffinity" number="4239"/> - <syscall name="sched_getaffinity" number="4240"/> - <syscall name="io_setup" number="4241"/> - <syscall name="io_destroy" number="4242"/> - <syscall name="io_getevents" number="4243"/> - <syscall name="io_submit" number="4244"/> - <syscall name="io_cancel" number="4245"/> - <syscall name="exit_group" number="4246"/> - <syscall name="lookup_dcookie" number="4247"/> - <syscall name="epoll_create" number="4248"/> - <syscall name="epoll_ctl" number="4249"/> - <syscall name="epoll_wait" number="4250"/> - <syscall name="remap_file_pages" number="4251"/> - <syscall name="set_tid_address" number="4252"/> - <syscall name="restart_syscall" number="4253"/> - <syscall name="fadvise64" number="4254"/> - <syscall name="statfs64" number="4255"/> - <syscall name="fstatfs64" number="4256"/> - <syscall name="timer_create" number="4257"/> - <syscall name="timer_settime" number="4258"/> - <syscall name="timer_gettime" number="4259"/> - <syscall name="timer_getoverrun" number="4260"/> - <syscall name="timer_delete" number="4261"/> - <syscall name="clock_settime" number="4262"/> - <syscall name="clock_gettime" number="4263"/> - <syscall name="clock_getres" number="4264"/> - <syscall name="clock_nanosleep" number="4265"/> - <syscall name="tgkill" number="4266"/> - <syscall name="utimes" number="4267"/> - <syscall name="mbind" number="4268"/> - <syscall name="get_mempolicy" number="4269"/> - <syscall name="set_mempolicy" number="4270"/> - <syscall name="mq_open" number="4271"/> - <syscall name="mq_unlink" number="4272"/> - <syscall name="mq_timedsend" number="4273"/> - <syscall name="mq_timedreceive" number="4274"/> - <syscall name="mq_notify" number="4275"/> - <syscall name="mq_getsetattr" number="4276"/> - <syscall name="vserver" number="4277"/> - <syscall name="waitid" number="4278"/> - <syscall name="add_key" 
number="4280"/> - <syscall name="request_key" number="4281"/> - <syscall name="keyctl" number="4282"/> - <syscall name="set_thread_area" number="4283"/> - <syscall name="inotify_init" number="4284"/> - <syscall name="inotify_add_watch" number="4285"/> - <syscall name="inotify_rm_watch" number="4286"/> - <syscall name="migrate_pages" number="4287"/> - <syscall name="openat" number="4288"/> - <syscall name="mkdirat" number="4289"/> - <syscall name="mknodat" number="4290"/> - <syscall name="fchownat" number="4291"/> - <syscall name="futimesat" number="4292"/> - <syscall name="fstatat64" number="4293"/> - <syscall name="unlinkat" number="4294"/> - <syscall name="renameat" number="4295"/> - <syscall name="linkat" number="4296"/> - <syscall name="symlinkat" number="4297"/> - <syscall name="readlinkat" number="4298"/> - <syscall name="fchmodat" number="4299"/> - <syscall name="faccessat" number="4300"/> - <syscall name="pselect6" number="4301"/> - <syscall name="ppoll" number="4302"/> - <syscall name="unshare" number="4303"/> - <syscall name="splice" number="4304"/> - <syscall name="sync_file_range" number="4305"/> - <syscall name="tee" number="4306"/> - <syscall name="vmsplice" number="4307"/> - <syscall name="move_pages" number="4308"/> - <syscall name="set_robust_list" number="4309"/> - <syscall name="get_robust_list" number="4310"/> - <syscall name="kexec_load" number="4311"/> - <syscall name="getcpu" number="4312"/> - <syscall name="epoll_pwait" number="4313"/> - <syscall name="ioprio_set" number="4314"/> - <syscall name="ioprio_get" number="4315"/> - <syscall name="utimensat" number="4316"/> - <syscall name="signalfd" number="4317"/> - <syscall name="timerfd" number="4318"/> - <syscall name="eventfd" number="4319"/> - <syscall name="fallocate" number="4320"/> - <syscall name="timerfd_create" number="4321"/> - <syscall name="timerfd_gettime" number="4322"/> - <syscall name="timerfd_settime" number="4323"/> - <syscall name="signalfd4" number="4324"/> - <syscall 
name="eventfd2" number="4325"/> - <syscall name="epoll_create1" number="4326"/> - <syscall name="dup3" number="4327"/> - <syscall name="pipe2" number="4328"/> - <syscall name="inotify_init1" number="4329"/> - <syscall name="preadv" number="4330"/> - <syscall name="pwritev" number="4331"/> - <syscall name="rt_tgsigqueueinfo" number="4332"/> - <syscall name="perf_event_open" number="4333"/> - <syscall name="accept4" number="4334"/> - <syscall name="recvmmsg" number="4335"/> - <syscall name="fanotify_init" number="4336"/> - <syscall name="fanotify_mark" number="4337"/> - <syscall name="prlimit64" number="4338"/> -</syscalls_info> diff --git a/share/gdb/syscalls/ppc-linux.xml b/share/gdb/syscalls/ppc-linux.xml deleted file mode 100644 index dd4eba6..0000000 --- a/share/gdb/syscalls/ppc-linux.xml +++ /dev/null @@ -1,310 +0,0 @@ -<?xml version="1.0"?> -<!-- Copyright (C) 2009-2013 Free Software Foundation, Inc. - - Copying and distribution of this file, with or without modification, - are permitted in any medium without royalty provided the copyright - notice and this notice are preserved. --> - -<!DOCTYPE feature SYSTEM "gdb-syscalls.dtd"> - -<!-- This file was generated using the following file: - - /usr/src/linux/arch/powerpc/include/asm/unistd.h - - The file mentioned above belongs to the Linux Kernel. 
--> - -<syscalls_info> - <syscall name="restart_syscall" number="0"/> - <syscall name="exit" number="1"/> - <syscall name="fork" number="2"/> - <syscall name="read" number="3"/> - <syscall name="write" number="4"/> - <syscall name="open" number="5"/> - <syscall name="close" number="6"/> - <syscall name="waitpid" number="7"/> - <syscall name="creat" number="8"/> - <syscall name="link" number="9"/> - <syscall name="unlink" number="10"/> - <syscall name="execve" number="11"/> - <syscall name="chdir" number="12"/> - <syscall name="time" number="13"/> - <syscall name="mknod" number="14"/> - <syscall name="chmod" number="15"/> - <syscall name="lchown" number="16"/> - <syscall name="break" number="17"/> - <syscall name="oldstat" number="18"/> - <syscall name="lseek" number="19"/> - <syscall name="getpid" number="20"/> - <syscall name="mount" number="21"/> - <syscall name="umount" number="22"/> - <syscall name="setuid" number="23"/> - <syscall name="getuid" number="24"/> - <syscall name="stime" number="25"/> - <syscall name="ptrace" number="26"/> - <syscall name="alarm" number="27"/> - <syscall name="oldfstat" number="28"/> - <syscall name="pause" number="29"/> - <syscall name="utime" number="30"/> - <syscall name="stty" number="31"/> - <syscall name="gtty" number="32"/> - <syscall name="access" number="33"/> - <syscall name="nice" number="34"/> - <syscall name="ftime" number="35"/> - <syscall name="sync" number="36"/> - <syscall name="kill" number="37"/> - <syscall name="rename" number="38"/> - <syscall name="mkdir" number="39"/> - <syscall name="rmdir" number="40"/> - <syscall name="dup" number="41"/> - <syscall name="pipe" number="42"/> - <syscall name="times" number="43"/> - <syscall name="prof" number="44"/> - <syscall name="brk" number="45"/> - <syscall name="setgid" number="46"/> - <syscall name="getgid" number="47"/> - <syscall name="signal" number="48"/> - <syscall name="geteuid" number="49"/> - <syscall name="getegid" number="50"/> - <syscall name="acct" 
number="51"/> - <syscall name="umount2" number="52"/> - <syscall name="lock" number="53"/> - <syscall name="ioctl" number="54"/> - <syscall name="fcntl" number="55"/> - <syscall name="mpx" number="56"/> - <syscall name="setpgid" number="57"/> - <syscall name="ulimit" number="58"/> - <syscall name="oldolduname" number="59"/> - <syscall name="umask" number="60"/> - <syscall name="chroot" number="61"/> - <syscall name="ustat" number="62"/> - <syscall name="dup2" number="63"/> - <syscall name="getppid" number="64"/> - <syscall name="getpgrp" number="65"/> - <syscall name="setsid" number="66"/> - <syscall name="sigaction" number="67"/> - <syscall name="sgetmask" number="68"/> - <syscall name="ssetmask" number="69"/> - <syscall name="setreuid" number="70"/> - <syscall name="setregid" number="71"/> - <syscall name="sigsuspend" number="72"/> - <syscall name="sigpending" number="73"/> - <syscall name="sethostname" number="74"/> - <syscall name="setrlimit" number="75"/> - <syscall name="getrlimit" number="76"/> - <syscall name="getrusage" number="77"/> - <syscall name="gettimeofday" number="78"/> - <syscall name="settimeofday" number="79"/> - <syscall name="getgroups" number="80"/> - <syscall name="setgroups" number="81"/> - <syscall name="select" number="82"/> - <syscall name="symlink" number="83"/> - <syscall name="oldlstat" number="84"/> - <syscall name="readlink" number="85"/> - <syscall name="uselib" number="86"/> - <syscall name="swapon" number="87"/> - <syscall name="reboot" number="88"/> - <syscall name="readdir" number="89"/> - <syscall name="mmap" number="90"/> - <syscall name="munmap" number="91"/> - <syscall name="truncate" number="92"/> - <syscall name="ftruncate" number="93"/> - <syscall name="fchmod" number="94"/> - <syscall name="fchown" number="95"/> - <syscall name="getpriority" number="96"/> - <syscall name="setpriority" number="97"/> - <syscall name="profil" number="98"/> - <syscall name="statfs" number="99"/> - <syscall name="fstatfs" number="100"/> - 
<syscall name="ioperm" number="101"/> - <syscall name="socketcall" number="102"/> - <syscall name="syslog" number="103"/> - <syscall name="setitimer" number="104"/> - <syscall name="getitimer" number="105"/> - <syscall name="stat" number="106"/> - <syscall name="lstat" number="107"/> - <syscall name="fstat" number="108"/> - <syscall name="olduname" number="109"/> - <syscall name="iopl" number="110"/> - <syscall name="vhangup" number="111"/> - <syscall name="idle" number="112"/> - <syscall name="vm86" number="113"/> - <syscall name="wait4" number="114"/> - <syscall name="swapoff" number="115"/> - <syscall name="sysinfo" number="116"/> - <syscall name="ipc" number="117"/> - <syscall name="fsync" number="118"/> - <syscall name="sigreturn" number="119"/> - <syscall name="clone" number="120"/> - <syscall name="setdomainname" number="121"/> - <syscall name="uname" number="122"/> - <syscall name="modify_ldt" number="123"/> - <syscall name="adjtimex" number="124"/> - <syscall name="mprotect" number="125"/> - <syscall name="sigprocmask" number="126"/> - <syscall name="create_module" number="127"/> - <syscall name="init_module" number="128"/> - <syscall name="delete_module" number="129"/> - <syscall name="get_kernel_syms" number="130"/> - <syscall name="quotactl" number="131"/> - <syscall name="getpgid" number="132"/> - <syscall name="fchdir" number="133"/> - <syscall name="bdflush" number="134"/> - <syscall name="sysfs" number="135"/> - <syscall name="personality" number="136"/> - <syscall name="afs_syscall" number="137"/> - <syscall name="setfsuid" number="138"/> - <syscall name="setfsgid" number="139"/> - <syscall name="_llseek" number="140"/> - <syscall name="getdents" number="141"/> - <syscall name="_newselect" number="142"/> - <syscall name="flock" number="143"/> - <syscall name="msync" number="144"/> - <syscall name="readv" number="145"/> - <syscall name="writev" number="146"/> - <syscall name="getsid" number="147"/> - <syscall name="fdatasync" number="148"/> - 
<syscall name="_sysctl" number="149"/> - <syscall name="mlock" number="150"/> - <syscall name="munlock" number="151"/> - <syscall name="mlockall" number="152"/> - <syscall name="munlockall" number="153"/> - <syscall name="sched_setparam" number="154"/> - <syscall name="sched_getparam" number="155"/> - <syscall name="sched_setscheduler" number="156"/> - <syscall name="sched_getscheduler" number="157"/> - <syscall name="sched_yield" number="158"/> - <syscall name="sched_get_priority_max" number="159"/> - <syscall name="sched_get_priority_min" number="160"/> - <syscall name="sched_rr_get_interval" number="161"/> - <syscall name="nanosleep" number="162"/> - <syscall name="mremap" number="163"/> - <syscall name="setresuid" number="164"/> - <syscall name="getresuid" number="165"/> - <syscall name="query_module" number="166"/> - <syscall name="poll" number="167"/> - <syscall name="nfsservctl" number="168"/> - <syscall name="setresgid" number="169"/> - <syscall name="getresgid" number="170"/> - <syscall name="prctl" number="171"/> - <syscall name="rt_sigreturn" number="172"/> - <syscall name="rt_sigaction" number="173"/> - <syscall name="rt_sigprocmask" number="174"/> - <syscall name="rt_sigpending" number="175"/> - <syscall name="rt_sigtimedwait" number="176"/> - <syscall name="rt_sigqueueinfo" number="177"/> - <syscall name="rt_sigsuspend" number="178"/> - <syscall name="pread64" number="179"/> - <syscall name="pwrite64" number="180"/> - <syscall name="chown" number="181"/> - <syscall name="getcwd" number="182"/> - <syscall name="capget" number="183"/> - <syscall name="capset" number="184"/> - <syscall name="sigaltstack" number="185"/> - <syscall name="sendfile" number="186"/> - <syscall name="getpmsg" number="187"/> - <syscall name="putpmsg" number="188"/> - <syscall name="vfork" number="189"/> - <syscall name="ugetrlimit" number="190"/> - <syscall name="readahead" number="191"/> - <syscall name="mmap2" number="192"/> - <syscall name="truncate64" number="193"/> - 
<syscall name="ftruncate64" number="194"/> - <syscall name="stat64" number="195"/> - <syscall name="lstat64" number="196"/> - <syscall name="fstat64" number="197"/> - <syscall name="pciconfig_read" number="198"/> - <syscall name="pciconfig_write" number="199"/> - <syscall name="pciconfig_iobase" number="200"/> - <syscall name="multiplexer" number="201"/> - <syscall name="getdents64" number="202"/> - <syscall name="pivot_root" number="203"/> - <syscall name="fcntl64" number="204"/> - <syscall name="madvise" number="205"/> - <syscall name="mincore" number="206"/> - <syscall name="gettid" number="207"/> - <syscall name="tkill" number="208"/> - <syscall name="setxattr" number="209"/> - <syscall name="lsetxattr" number="210"/> - <syscall name="fsetxattr" number="211"/> - <syscall name="getxattr" number="212"/> - <syscall name="lgetxattr" number="213"/> - <syscall name="fgetxattr" number="214"/> - <syscall name="listxattr" number="215"/> - <syscall name="llistxattr" number="216"/> - <syscall name="flistxattr" number="217"/> - <syscall name="removexattr" number="218"/> - <syscall name="lremovexattr" number="219"/> - <syscall name="fremovexattr" number="220"/> - <syscall name="futex" number="221"/> - <syscall name="sched_setaffinity" number="222"/> - <syscall name="sched_getaffinity" number="223"/> - <syscall name="tuxcall" number="225"/> - <syscall name="sendfile64" number="226"/> - <syscall name="io_setup" number="227"/> - <syscall name="io_destroy" number="228"/> - <syscall name="io_getevents" number="229"/> - <syscall name="io_submit" number="230"/> - <syscall name="io_cancel" number="231"/> - <syscall name="set_tid_address" number="232"/> - <syscall name="fadvise64" number="233"/> - <syscall name="exit_group" number="234"/> - <syscall name="lookup_dcookie" number="235"/> - <syscall name="epoll_create" number="236"/> - <syscall name="epoll_ctl" number="237"/> - <syscall name="epoll_wait" number="238"/> - <syscall name="remap_file_pages" number="239"/> - <syscall 
name="timer_create" number="240"/> - <syscall name="timer_settime" number="241"/> - <syscall name="timer_gettime" number="242"/> - <syscall name="timer_getoverrun" number="243"/> - <syscall name="timer_delete" number="244"/> - <syscall name="clock_settime" number="245"/> - <syscall name="clock_gettime" number="246"/> - <syscall name="clock_getres" number="247"/> - <syscall name="clock_nanosleep" number="248"/> - <syscall name="swapcontext" number="249"/> - <syscall name="tgkill" number="250"/> - <syscall name="utimes" number="251"/> - <syscall name="statfs64" number="252"/> - <syscall name="fstatfs64" number="253"/> - <syscall name="fadvise64_64" number="254"/> - <syscall name="rtas" number="255"/> - <syscall name="sys_debug_setcontext" number="256"/> - <syscall name="mbind" number="259"/> - <syscall name="get_mempolicy" number="260"/> - <syscall name="set_mempolicy" number="261"/> - <syscall name="mq_open" number="262"/> - <syscall name="mq_unlink" number="263"/> - <syscall name="mq_timedsend" number="264"/> - <syscall name="mq_timedreceive" number="265"/> - <syscall name="mq_notify" number="266"/> - <syscall name="mq_getsetattr" number="267"/> - <syscall name="kexec_load" number="268"/> - <syscall name="add_key" number="269"/> - <syscall name="request_key" number="270"/> - <syscall name="keyctl" number="271"/> - <syscall name="waitid" number="272"/> - <syscall name="ioprio_set" number="273"/> - <syscall name="ioprio_get" number="274"/> - <syscall name="inotify_init" number="275"/> - <syscall name="inotify_add_watch" number="276"/> - <syscall name="inotify_rm_watch" number="277"/> - <syscall name="spu_run" number="278"/> - <syscall name="spu_create" number="279"/> - <syscall name="pselect6" number="280"/> - <syscall name="ppoll" number="281"/> - <syscall name="unshare" number="282"/> - <syscall name="openat" number="286"/> - <syscall name="mkdirat" number="287"/> - <syscall name="mknodat" number="288"/> - <syscall name="fchownat" number="289"/> - <syscall 
name="futimesat" number="290"/> - <syscall name="fstatat64" number="291"/> - <syscall name="unlinkat" number="292"/> - <syscall name="renameat" number="293"/> - <syscall name="linkat" number="294"/> - <syscall name="symlinkat" number="295"/> - <syscall name="readlinkat" number="296"/> - <syscall name="fchmodat" number="297"/> - <syscall name="faccessat" number="298"/> -</syscalls_info> diff --git a/share/gdb/syscalls/ppc64-linux.xml b/share/gdb/syscalls/ppc64-linux.xml deleted file mode 100644 index ad56db1..0000000 --- a/share/gdb/syscalls/ppc64-linux.xml +++ /dev/null @@ -1,295 +0,0 @@ -<?xml version="1.0"?> -<!-- Copyright (C) 2009-2013 Free Software Foundation, Inc. - - Copying and distribution of this file, with or without modification, - are permitted in any medium without royalty provided the copyright - notice and this notice are preserved. --> - -<!DOCTYPE feature SYSTEM "gdb-syscalls.dtd"> - -<!-- This file was generated using the following file: - - /usr/src/linux/arch/powerpc/include/asm/unistd.h - - The file mentioned above belongs to the Linux Kernel. 
--> - -<syscalls_info> - <syscall name="restart_syscall" number="0"/> - <syscall name="exit" number="1"/> - <syscall name="fork" number="2"/> - <syscall name="read" number="3"/> - <syscall name="write" number="4"/> - <syscall name="open" number="5"/> - <syscall name="close" number="6"/> - <syscall name="waitpid" number="7"/> - <syscall name="creat" number="8"/> - <syscall name="link" number="9"/> - <syscall name="unlink" number="10"/> - <syscall name="execve" number="11"/> - <syscall name="chdir" number="12"/> - <syscall name="time" number="13"/> - <syscall name="mknod" number="14"/> - <syscall name="chmod" number="15"/> - <syscall name="lchown" number="16"/> - <syscall name="break" number="17"/> - <syscall name="oldstat" number="18"/> - <syscall name="lseek" number="19"/> - <syscall name="getpid" number="20"/> - <syscall name="mount" number="21"/> - <syscall name="umount" number="22"/> - <syscall name="setuid" number="23"/> - <syscall name="getuid" number="24"/> - <syscall name="stime" number="25"/> - <syscall name="ptrace" number="26"/> - <syscall name="alarm" number="27"/> - <syscall name="oldfstat" number="28"/> - <syscall name="pause" number="29"/> - <syscall name="utime" number="30"/> - <syscall name="stty" number="31"/> - <syscall name="gtty" number="32"/> - <syscall name="access" number="33"/> - <syscall name="nice" number="34"/> - <syscall name="ftime" number="35"/> - <syscall name="sync" number="36"/> - <syscall name="kill" number="37"/> - <syscall name="rename" number="38"/> - <syscall name="mkdir" number="39"/> - <syscall name="rmdir" number="40"/> - <syscall name="dup" number="41"/> - <syscall name="pipe" number="42"/> - <syscall name="times" number="43"/> - <syscall name="prof" number="44"/> - <syscall name="brk" number="45"/> - <syscall name="setgid" number="46"/> - <syscall name="getgid" number="47"/> - <syscall name="signal" number="48"/> - <syscall name="geteuid" number="49"/> - <syscall name="getegid" number="50"/> - <syscall name="acct" 
number="51"/> - <syscall name="umount2" number="52"/> - <syscall name="lock" number="53"/> - <syscall name="ioctl" number="54"/> - <syscall name="fcntl" number="55"/> - <syscall name="mpx" number="56"/> - <syscall name="setpgid" number="57"/> - <syscall name="ulimit" number="58"/> - <syscall name="oldolduname" number="59"/> - <syscall name="umask" number="60"/> - <syscall name="chroot" number="61"/> - <syscall name="ustat" number="62"/> - <syscall name="dup2" number="63"/> - <syscall name="getppid" number="64"/> - <syscall name="getpgrp" number="65"/> - <syscall name="setsid" number="66"/> - <syscall name="sigaction" number="67"/> - <syscall name="sgetmask" number="68"/> - <syscall name="ssetmask" number="69"/> - <syscall name="setreuid" number="70"/> - <syscall name="setregid" number="71"/> - <syscall name="sigsuspend" number="72"/> - <syscall name="sigpending" number="73"/> - <syscall name="sethostname" number="74"/> - <syscall name="setrlimit" number="75"/> - <syscall name="getrlimit" number="76"/> - <syscall name="getrusage" number="77"/> - <syscall name="gettimeofday" number="78"/> - <syscall name="settimeofday" number="79"/> - <syscall name="getgroups" number="80"/> - <syscall name="setgroups" number="81"/> - <syscall name="select" number="82"/> - <syscall name="symlink" number="83"/> - <syscall name="oldlstat" number="84"/> - <syscall name="readlink" number="85"/> - <syscall name="uselib" number="86"/> - <syscall name="swapon" number="87"/> - <syscall name="reboot" number="88"/> - <syscall name="readdir" number="89"/> - <syscall name="mmap" number="90"/> - <syscall name="munmap" number="91"/> - <syscall name="truncate" number="92"/> - <syscall name="ftruncate" number="93"/> - <syscall name="fchmod" number="94"/> - <syscall name="fchown" number="95"/> - <syscall name="getpriority" number="96"/> - <syscall name="setpriority" number="97"/> - <syscall name="profil" number="98"/> - <syscall name="statfs" number="99"/> - <syscall name="fstatfs" number="100"/> - 
<syscall name="ioperm" number="101"/>
-  <syscall name="socketcall" number="102"/>
-  <syscall name="syslog" number="103"/>
-  <syscall name="setitimer" number="104"/>
-  <syscall name="getitimer" number="105"/>
-  <syscall name="stat" number="106"/>
-  <syscall name="lstat" number="107"/>
-  <syscall name="fstat" number="108"/>
-  <syscall name="olduname" number="109"/>
-  <syscall name="iopl" number="110"/>
-  <syscall name="vhangup" number="111"/>
-  <syscall name="idle" number="112"/>
-  <syscall name="vm86" number="113"/>
-  <syscall name="wait4" number="114"/>
-  <syscall name="swapoff" number="115"/>
-  <syscall name="sysinfo" number="116"/>
-  <syscall name="ipc" number="117"/>
-  <syscall name="fsync" number="118"/>
-  <syscall name="sigreturn" number="119"/>
-  <syscall name="clone" number="120"/>
-  <syscall name="setdomainname" number="121"/>
-  <syscall name="uname" number="122"/>
-  <syscall name="modify_ldt" number="123"/>
-  <syscall name="adjtimex" number="124"/>
-  <syscall name="mprotect" number="125"/>
-  <syscall name="sigprocmask" number="126"/>
-  <syscall name="create_module" number="127"/>
-  <syscall name="init_module" number="128"/>
-  <syscall name="delete_module" number="129"/>
-  <syscall name="get_kernel_syms" number="130"/>
-  <syscall name="quotactl" number="131"/>
-  <syscall name="getpgid" number="132"/>
-  <syscall name="fchdir" number="133"/>
-  <syscall name="bdflush" number="134"/>
-  <syscall name="sysfs" number="135"/>
-  <syscall name="personality" number="136"/>
-  <syscall name="afs_syscall" number="137"/>
-  <syscall name="setfsuid" number="138"/>
-  <syscall name="setfsgid" number="139"/>
-  <syscall name="_llseek" number="140"/>
-  <syscall name="getdents" number="141"/>
-  <syscall name="_newselect" number="142"/>
-  <syscall name="flock" number="143"/>
-  <syscall name="msync" number="144"/>
-  <syscall name="readv" number="145"/>
-  <syscall name="writev" number="146"/>
-  <syscall name="getsid" number="147"/>
-  <syscall name="fdatasync" number="148"/>
-  <syscall name="_sysctl" number="149"/>
-  <syscall name="mlock" number="150"/>
-  <syscall name="munlock" number="151"/>
-  <syscall name="mlockall" number="152"/>
-  <syscall name="munlockall" number="153"/>
-  <syscall name="sched_setparam" number="154"/>
-  <syscall name="sched_getparam" number="155"/>
-  <syscall name="sched_setscheduler" number="156"/>
-  <syscall name="sched_getscheduler" number="157"/>
-  <syscall name="sched_yield" number="158"/>
-  <syscall name="sched_get_priority_max" number="159"/>
-  <syscall name="sched_get_priority_min" number="160"/>
-  <syscall name="sched_rr_get_interval" number="161"/>
-  <syscall name="nanosleep" number="162"/>
-  <syscall name="mremap" number="163"/>
-  <syscall name="setresuid" number="164"/>
-  <syscall name="getresuid" number="165"/>
-  <syscall name="query_module" number="166"/>
-  <syscall name="poll" number="167"/>
-  <syscall name="nfsservctl" number="168"/>
-  <syscall name="setresgid" number="169"/>
-  <syscall name="getresgid" number="170"/>
-  <syscall name="prctl" number="171"/>
-  <syscall name="rt_sigreturn" number="172"/>
-  <syscall name="rt_sigaction" number="173"/>
-  <syscall name="rt_sigprocmask" number="174"/>
-  <syscall name="rt_sigpending" number="175"/>
-  <syscall name="rt_sigtimedwait" number="176"/>
-  <syscall name="rt_sigqueueinfo" number="177"/>
-  <syscall name="rt_sigsuspend" number="178"/>
-  <syscall name="pread64" number="179"/>
-  <syscall name="pwrite64" number="180"/>
-  <syscall name="chown" number="181"/>
-  <syscall name="getcwd" number="182"/>
-  <syscall name="capget" number="183"/>
-  <syscall name="capset" number="184"/>
-  <syscall name="sigaltstack" number="185"/>
-  <syscall name="sendfile" number="186"/>
-  <syscall name="getpmsg" number="187"/>
-  <syscall name="putpmsg" number="188"/>
-  <syscall name="vfork" number="189"/>
-  <syscall name="ugetrlimit" number="190"/>
-  <syscall name="readahead" number="191"/>
-  <syscall name="pciconfig_read" number="198"/>
-  <syscall name="pciconfig_write" number="199"/>
-  <syscall name="pciconfig_iobase" number="200"/>
-  <syscall name="multiplexer" number="201"/>
-  <syscall name="getdents64" number="202"/>
-  <syscall name="pivot_root" number="203"/>
-  <syscall name="madvise" number="205"/>
-  <syscall name="mincore" number="206"/>
-  <syscall name="gettid" number="207"/>
-  <syscall name="tkill" number="208"/>
-  <syscall name="setxattr" number="209"/>
-  <syscall name="lsetxattr" number="210"/>
-  <syscall name="fsetxattr" number="211"/>
-  <syscall name="getxattr" number="212"/>
-  <syscall name="lgetxattr" number="213"/>
-  <syscall name="fgetxattr" number="214"/>
-  <syscall name="listxattr" number="215"/>
-  <syscall name="llistxattr" number="216"/>
-  <syscall name="flistxattr" number="217"/>
-  <syscall name="removexattr" number="218"/>
-  <syscall name="lremovexattr" number="219"/>
-  <syscall name="fremovexattr" number="220"/>
-  <syscall name="futex" number="221"/>
-  <syscall name="sched_setaffinity" number="222"/>
-  <syscall name="sched_getaffinity" number="223"/>
-  <syscall name="tuxcall" number="225"/>
-  <syscall name="io_setup" number="227"/>
-  <syscall name="io_destroy" number="228"/>
-  <syscall name="io_getevents" number="229"/>
-  <syscall name="io_submit" number="230"/>
-  <syscall name="io_cancel" number="231"/>
-  <syscall name="set_tid_address" number="232"/>
-  <syscall name="fadvise64" number="233"/>
-  <syscall name="exit_group" number="234"/>
-  <syscall name="lookup_dcookie" number="235"/>
-  <syscall name="epoll_create" number="236"/>
-  <syscall name="epoll_ctl" number="237"/>
-  <syscall name="epoll_wait" number="238"/>
-  <syscall name="remap_file_pages" number="239"/>
-  <syscall name="timer_create" number="240"/>
-  <syscall name="timer_settime" number="241"/>
-  <syscall name="timer_gettime" number="242"/>
-  <syscall name="timer_getoverrun" number="243"/>
-  <syscall name="timer_delete" number="244"/>
-  <syscall name="clock_settime" number="245"/>
-  <syscall name="clock_gettime" number="246"/>
-  <syscall name="clock_getres" number="247"/>
-  <syscall name="clock_nanosleep" number="248"/>
-  <syscall name="swapcontext" number="249"/>
-  <syscall name="tgkill" number="250"/>
-  <syscall name="utimes" number="251"/>
-  <syscall name="statfs64" number="252"/>
-  <syscall name="fstatfs64" number="253"/>
-  <syscall name="rtas" number="255"/>
-  <syscall name="sys_debug_setcontext" number="256"/>
-  <syscall name="mbind" number="259"/>
-  <syscall name="get_mempolicy" number="260"/>
-  <syscall name="set_mempolicy" number="261"/>
-  <syscall name="mq_open" number="262"/>
-  <syscall name="mq_unlink" number="263"/>
-  <syscall name="mq_timedsend" number="264"/>
-  <syscall name="mq_timedreceive" number="265"/>
-  <syscall name="mq_notify" number="266"/>
-  <syscall name="mq_getsetattr" number="267"/>
-  <syscall name="kexec_load" number="268"/>
-  <syscall name="add_key" number="269"/>
-  <syscall name="request_key" number="270"/>
-  <syscall name="keyctl" number="271"/>
-  <syscall name="waitid" number="272"/>
-  <syscall name="ioprio_set" number="273"/>
-  <syscall name="ioprio_get" number="274"/>
-  <syscall name="inotify_init" number="275"/>
-  <syscall name="inotify_add_watch" number="276"/>
-  <syscall name="inotify_rm_watch" number="277"/>
-  <syscall name="spu_run" number="278"/>
-  <syscall name="spu_create" number="279"/>
-  <syscall name="pselect6" number="280"/>
-  <syscall name="ppoll" number="281"/>
-  <syscall name="unshare" number="282"/>
-  <syscall name="unlinkat" number="286"/>
-  <syscall name="renameat" number="287"/>
-  <syscall name="linkat" number="288"/>
-  <syscall name="symlinkat" number="289"/>
-  <syscall name="readlinkat" number="290"/>
-  <syscall name="fchmodat" number="291"/>
-  <syscall name="faccessat" number="292"/>
-</syscalls_info>
diff --git a/share/gdb/syscalls/sparc-linux.xml b/share/gdb/syscalls/sparc-linux.xml
deleted file mode 100644
index 7673621..0000000
--- a/share/gdb/syscalls/sparc-linux.xml
+++ /dev/null
@@ -1,344 +0,0 @@
-<?xml version="1.0"?>
-<!-- Copyright (C) 2010-2013 Free Software Foundation, Inc.
-
-     Copying and distribution of this file, with or without modification,
-     are permitted in any medium without royalty provided the copyright
-     notice and this notice are preserved.  -->
-
-<!DOCTYPE feature SYSTEM "gdb-syscalls.dtd">
-
-<!-- This file was generated using the following file:
-
-     /usr/src/linux/arch/sparc/include/asm/unistd.h
-
-     The file mentioned above belongs to the Linux Kernel.  -->
-
-<syscalls_info>
-  <syscall name="restart_syscall" number="0"/>
-  <syscall name="exit" number="1"/>
-  <syscall name="fork" number="2"/>
-  <syscall name="read" number="3"/>
-  <syscall name="write" number="4"/>
-  <syscall name="open" number="5"/>
-  <syscall name="close" number="6"/>
-  <syscall name="wait4" number="7"/>
-  <syscall name="creat" number="8"/>
-  <syscall name="link" number="9"/>
-  <syscall name="unlink" number="10"/>
-  <syscall name="execv" number="11"/>
-  <syscall name="chdir" number="12"/>
-  <syscall name="chown" number="13"/>
-  <syscall name="mknod" number="14"/>
-  <syscall name="chmod" number="15"/>
-  <syscall name="lchown" number="16"/>
-  <syscall name="brk" number="17"/>
-  <syscall name="perfctr" number="18"/>
-  <syscall name="lseek" number="19"/>
-  <syscall name="getpid" number="20"/>
-  <syscall name="capget" number="21"/>
-  <syscall name="capset" number="22"/>
-  <syscall name="setuid" number="23"/>
-  <syscall name="getuid" number="24"/>
-  <syscall name="vmsplice" number="25"/>
-  <syscall name="ptrace" number="26"/>
-  <syscall name="alarm" number="27"/>
-  <syscall name="sigaltstack" number="28"/>
-  <syscall name="pause" number="29"/>
-  <syscall name="utime" number="30"/>
-  <syscall name="lchown32" number="31"/>
-  <syscall name="fchown32" number="32"/>
-  <syscall name="access" number="33"/>
-  <syscall name="nice" number="34"/>
-  <syscall name="chown32" number="35"/>
-  <syscall name="sync" number="36"/>
-  <syscall name="kill" number="37"/>
-  <syscall name="stat" number="38"/>
-  <syscall name="sendfile" number="39"/>
-  <syscall name="lstat" number="40"/>
-  <syscall name="dup" number="41"/>
-  <syscall name="pipe" number="42"/>
-  <syscall name="times" number="43"/>
-  <syscall name="getuid32" number="44"/>
-  <syscall name="umount2" number="45"/>
-  <syscall name="setgid" number="46"/>
-  <syscall name="getgid" number="47"/>
-  <syscall name="signal" number="48"/>
-  <syscall name="geteuid" number="49"/>
-  <syscall name="getegid" number="50"/>
-  <syscall name="acct" number="51"/>
-  <syscall name="getgid32" number="53"/>
-  <syscall name="ioctl" number="54"/>
-  <syscall name="reboot" number="55"/>
-  <syscall name="mmap2" number="56"/>
-  <syscall name="symlink" number="57"/>
-  <syscall name="readlink" number="58"/>
-  <syscall name="execve" number="59"/>
-  <syscall name="umask" number="60"/>
-  <syscall name="chroot" number="61"/>
-  <syscall name="fstat" number="62"/>
-  <syscall name="fstat64" number="63"/>
-  <syscall name="getpagesize" number="64"/>
-  <syscall name="msync" number="65"/>
-  <syscall name="vfork" number="66"/>
-  <syscall name="pread64" number="67"/>
-  <syscall name="pwrite64" number="68"/>
-  <syscall name="geteuid32" number="69"/>
-  <syscall name="getegid32" number="70"/>
-  <syscall name="mmap" number="71"/>
-  <syscall name="setreuid32" number="72"/>
-  <syscall name="munmap" number="73"/>
-  <syscall name="mprotect" number="74"/>
-  <syscall name="madvise" number="75"/>
-  <syscall name="vhangup" number="76"/>
-  <syscall name="truncate64" number="77"/>
-  <syscall name="mincore" number="78"/>
-  <syscall name="getgroups" number="79"/>
-  <syscall name="setgroups" number="80"/>
-  <syscall name="getpgrp" number="81"/>
-  <syscall name="setgroups32" number="82"/>
-  <syscall name="setitimer" number="83"/>
-  <syscall name="ftruncate64" number="84"/>
-  <syscall name="swapon" number="85"/>
-  <syscall name="getitimer" number="86"/>
-  <syscall name="setuid32" number="87"/>
-  <syscall name="sethostname" number="88"/>
-  <syscall name="setgid32" number="89"/>
-  <syscall name="dup2" number="90"/>
-  <syscall name="setfsuid32" number="91"/>
-  <syscall name="fcntl" number="92"/>
-  <syscall name="select" number="93"/>
-  <syscall name="setfsgid32" number="94"/>
-  <syscall name="fsync" number="95"/>
-  <syscall name="setpriority" number="96"/>
-  <syscall name="socket" number="97"/>
-  <syscall name="connect" number="98"/>
-  <syscall name="accept" number="99"/>
-  <syscall name="getpriority" number="100"/>
-  <syscall name="rt_sigreturn" number="101"/>
-  <syscall name="rt_sigaction" number="102"/>
-  <syscall name="rt_sigprocmask" number="103"/>
-  <syscall name="rt_sigpending" number="104"/>
-  <syscall name="rt_sigtimedwait" number="105"/>
-  <syscall name="rt_sigqueueinfo" number="106"/>
-  <syscall name="rt_sigsuspend" number="107"/>
-  <syscall name="setresuid32" number="108"/>
-  <syscall name="getresuid32" number="109"/>
-  <syscall name="setresgid32" number="110"/>
-  <syscall name="getresgid32" number="111"/>
-  <syscall name="setregid32" number="112"/>
-  <syscall name="recvmsg" number="113"/>
-  <syscall name="sendmsg" number="114"/>
-  <syscall name="getgroups32" number="115"/>
-  <syscall name="gettimeofday" number="116"/>
-  <syscall name="getrusage" number="117"/>
-  <syscall name="getsockopt" number="118"/>
-  <syscall name="getcwd" number="119"/>
-  <syscall name="readv" number="120"/>
-  <syscall name="writev" number="121"/>
-  <syscall name="settimeofday" number="122"/>
-  <syscall name="fchown" number="123"/>
-  <syscall name="fchmod" number="124"/>
-  <syscall name="recvfrom" number="125"/>
-  <syscall name="setreuid" number="126"/>
-  <syscall name="setregid" number="127"/>
-  <syscall name="rename" number="128"/>
-  <syscall name="truncate" number="129"/>
-  <syscall name="ftruncate" number="130"/>
-  <syscall name="flock" number="131"/>
-  <syscall name="lstat64" number="132"/>
-  <syscall name="sendto" number="133"/>
-  <syscall name="shutdown" number="134"/>
-  <syscall name="socketpair" number="135"/>
-  <syscall name="mkdir" number="136"/>
-  <syscall name="rmdir" number="137"/>
-  <syscall name="utimes" number="138"/>
-  <syscall name="stat64" number="139"/>
-  <syscall name="sendfile64" number="140"/>
-  <syscall name="getpeername" number="141"/>
-  <syscall name="futex" number="142"/>
-  <syscall name="gettid" number="143"/>
-  <syscall name="getrlimit" number="144"/>
-  <syscall name="setrlimit" number="145"/>
-  <syscall name="pivot_root" number="146"/>
-  <syscall name="prctl" number="147"/>
-  <syscall name="pciconfig_read" number="148"/>
-  <syscall name="pciconfig_write" number="149"/>
-  <syscall name="getsockname" number="150"/>
-  <syscall name="inotify_init" number="151"/>
-  <syscall name="inotify_add_watch" number="152"/>
-  <syscall name="poll" number="153"/>
-  <syscall name="getdents64" number="154"/>
-  <syscall name="fcntl64" number="155"/>
-  <syscall name="inotify_rm_watch" number="156"/>
-  <syscall name="statfs" number="157"/>
-  <syscall name="fstatfs" number="158"/>
-  <syscall name="umount" number="159"/>
-  <syscall name="sched_set_affinity" number="160"/>
-  <syscall name="sched_get_affinity" number="161"/>
-  <syscall name="getdomainname" number="162"/>
-  <syscall name="setdomainname" number="163"/>
-  <syscall name="quotactl" number="165"/>
-  <syscall name="set_tid_address" number="166"/>
-  <syscall name="mount" number="167"/>
-  <syscall name="ustat" number="168"/>
-  <syscall name="setxattr" number="169"/>
-  <syscall name="lsetxattr" number="170"/>
-  <syscall name="fsetxattr" number="171"/>
-  <syscall name="getxattr" number="172"/>
-  <syscall name="lgetxattr" number="173"/>
-  <syscall name="getdents" number="174"/>
-  <syscall name="setsid" number="175"/>
-  <syscall name="fchdir" number="176"/>
-  <syscall name="fgetxattr" number="177"/>
-  <syscall name="listxattr" number="178"/>
-  <syscall name="llistxattr" number="179"/>
-  <syscall name="flistxattr" number="180"/>
-  <syscall name="removexattr" number="181"/>
-  <syscall name="lremovexattr" number="182"/>
-  <syscall name="sigpending" number="183"/>
-  <syscall name="query_module" number="184"/>
-  <syscall name="setpgid" number="185"/>
-  <syscall name="fremovexattr" number="186"/>
-  <syscall name="tkill" number="187"/>
-  <syscall name="exit_group" number="188"/>
-  <syscall name="uname" number="189"/>
-  <syscall name="init_module" number="190"/>
-  <syscall name="personality" number="191"/>
-  <syscall name="remap_file_pages" number="192"/>
-  <syscall name="epoll_create" number="193"/>
-  <syscall name="epoll_ctl" number="194"/>
-  <syscall name="epoll_wait" number="195"/>
-  <syscall name="ioprio_set" number="196"/>
-  <syscall name="getppid" number="197"/>
-  <syscall name="sigaction" number="198"/>
-  <syscall name="sgetmask" number="199"/>
-  <syscall name="ssetmask" number="200"/>
-  <syscall name="sigsuspend" number="201"/>
-  <syscall name="oldlstat" number="202"/>
-  <syscall name="uselib" number="203"/>
-  <syscall name="readdir" number="204"/>
-  <syscall name="readahead" number="205"/>
-  <syscall name="socketcall" number="206"/>
-  <syscall name="syslog" number="207"/>
-  <syscall name="lookup_dcookie" number="208"/>
-  <syscall name="fadvise64" number="209"/>
-  <syscall name="fadvise64_64" number="210"/>
-  <syscall name="tgkill" number="211"/>
-  <syscall name="waitpid" number="212"/>
-  <syscall name="swapoff" number="213"/>
-  <syscall name="sysinfo" number="214"/>
-  <syscall name="ipc" number="215"/>
-  <syscall name="sigreturn" number="216"/>
-  <syscall name="clone" number="217"/>
-  <syscall name="ioprio_get" number="218"/>
-  <syscall name="adjtimex" number="219"/>
-  <syscall name="sigprocmask" number="220"/>
-  <syscall name="create_module" number="221"/>
-  <syscall name="delete_module" number="222"/>
-  <syscall name="get_kernel_syms" number="223"/>
-  <syscall name="getpgid" number="224"/>
-  <syscall name="bdflush" number="225"/>
-  <syscall name="sysfs" number="226"/>
-  <syscall name="afs_syscall" number="227"/>
-  <syscall name="setfsuid" number="228"/>
-  <syscall name="setfsgid" number="229"/>
-  <syscall name="_newselect" number="230"/>
-  <syscall name="time" number="231"/>
-  <syscall name="splice" number="232"/>
-  <syscall name="stime" number="233"/>
-  <syscall name="statfs64" number="234"/>
-  <syscall name="fstatfs64" number="235"/>
-  <syscall name="_llseek" number="236"/>
-  <syscall name="mlock" number="237"/>
-  <syscall name="munlock" number="238"/>
-  <syscall name="mlockall" number="239"/>
-  <syscall name="munlockall" number="240"/>
-  <syscall name="sched_setparam" number="241"/>
-  <syscall name="sched_getparam" number="242"/>
-  <syscall name="sched_setscheduler" number="243"/>
-  <syscall name="sched_getscheduler" number="244"/>
-  <syscall name="sched_yield" number="245"/>
-  <syscall name="sched_get_priority_max" number="246"/>
-  <syscall name="sched_get_priority_min" number="247"/>
-  <syscall name="sched_rr_get_interval" number="248"/>
-  <syscall name="nanosleep" number="249"/>
-  <syscall name="mremap" number="250"/>
-  <syscall name="_sysctl" number="251"/>
-  <syscall name="getsid" number="252"/>
-  <syscall name="fdatasync" number="253"/>
-  <syscall name="nfsservctl" number="254"/>
-  <syscall name="sync_file_range" number="255"/>
-  <syscall name="clock_settime" number="256"/>
-  <syscall name="clock_gettime" number="257"/>
-  <syscall name="clock_getres" number="258"/>
-  <syscall name="clock_nanosleep" number="259"/>
-  <syscall name="sched_getaffinity" number="260"/>
-  <syscall name="sched_setaffinity" number="261"/>
-  <syscall name="timer_settime" number="262"/>
-  <syscall name="timer_gettime" number="263"/>
-  <syscall name="timer_getoverrun" number="264"/>
-  <syscall name="timer_delete" number="265"/>
-  <syscall name="timer_create" number="266"/>
-  <syscall name="vserver" number="267"/>
-  <syscall name="io_setup" number="268"/>
-  <syscall name="io_destroy" number="269"/>
-  <syscall name="io_submit" number="270"/>
-  <syscall name="io_cancel" number="271"/>
-  <syscall name="io_getevents" number="272"/>
-  <syscall name="mq_open" number="273"/>
-  <syscall name="mq_unlink" number="274"/>
-  <syscall name="mq_timedsend" number="275"/>
-  <syscall name="mq_timedreceive" number="276"/>
-  <syscall name="mq_notify" number="277"/>
-  <syscall name="mq_getsetattr" number="278"/>
-  <syscall name="waitid" number="279"/>
-  <syscall name="tee" number="280"/>
-  <syscall name="add_key" number="281"/>
-  <syscall name="request_key" number="282"/>
-  <syscall name="keyctl" number="283"/>
-  <syscall name="openat" number="284"/>
-  <syscall name="mkdirat" number="285"/>
-  <syscall name="mknodat" number="286"/>
-  <syscall name="fchownat" number="287"/>
-  <syscall name="futimesat" number="288"/>
-  <syscall name="fstatat64" number="289"/>
-  <syscall name="unlinkat" number="290"/>
-  <syscall name="renameat" number="291"/>
-  <syscall name="linkat" number="292"/>
-  <syscall name="symlinkat" number="293"/>
-  <syscall name="readlinkat" number="294"/>
-  <syscall name="fchmodat" number="295"/>
-  <syscall name="faccessat" number="296"/>
-  <syscall name="pselect6" number="297"/>
-  <syscall name="ppoll" number="298"/>
-  <syscall name="unshare" number="299"/>
-  <syscall name="set_robust_list" number="300"/>
-  <syscall name="get_robust_list" number="301"/>
-  <syscall name="migrate_pages" number="302"/>
-  <syscall name="mbind" number="303"/>
-  <syscall name="get_mempolicy" number="304"/>
-  <syscall name="set_mempolicy" number="305"/>
-  <syscall name="kexec_load" number="306"/>
-  <syscall name="move_pages" number="307"/>
-  <syscall name="getcpu" number="308"/>
-  <syscall name="epoll_pwait" number="309"/>
-  <syscall name="utimensat" number="310"/>
-  <syscall name="signalfd" number="311"/>
-  <syscall name="timerfd_create" number="312"/>
-  <syscall name="eventfd" number="313"/>
-  <syscall name="fallocate" number="314"/>
-  <syscall name="timerfd_settime" number="315"/>
-  <syscall name="timerfd_gettime" number="316"/>
-  <syscall name="signalfd4" number="317"/>
-  <syscall name="eventfd2" number="318"/>
-  <syscall name="epoll_create1" number="319"/>
-  <syscall name="dup3" number="320"/>
-  <syscall name="pipe2" number="321"/>
-  <syscall name="inotify_init1" number="322"/>
-  <syscall name="accept4" number="323"/>
-  <syscall name="preadv" number="324"/>
-  <syscall name="pwritev" number="325"/>
-  <syscall name="rt_tgsigqueueinfo" number="326"/>
-  <syscall name="perf_event_open" number="327"/>
-  <syscall name="recvmmsg" number="328"/>
-</syscalls_info>
diff --git a/share/gdb/syscalls/sparc64-linux.xml b/share/gdb/syscalls/sparc64-linux.xml
deleted file mode 100644
index 4403ca3..0000000
--- a/share/gdb/syscalls/sparc64-linux.xml
+++ /dev/null
@@ -1,326 +0,0 @@
-<?xml version="1.0"?>
-<!-- Copyright (C) 2010-2013 Free Software Foundation, Inc.
-
-     Copying and distribution of this file, with or without modification,
-     are permitted in any medium without royalty provided the copyright
-     notice and this notice are preserved.  -->
-
-<!DOCTYPE feature SYSTEM "gdb-syscalls.dtd">
-
-<!-- This file was generated using the following file:
-
-     /usr/src/linux/arch/sparc/include/asm/unistd.h
-
-     The file mentioned above belongs to the Linux Kernel.  -->
-
-<syscalls_info>
-  <syscall name="restart_syscall" number="0"/>
-  <syscall name="exit" number="1"/>
-  <syscall name="fork" number="2"/>
-  <syscall name="read" number="3"/>
-  <syscall name="write" number="4"/>
-  <syscall name="open" number="5"/>
-  <syscall name="close" number="6"/>
-  <syscall name="wait4" number="7"/>
-  <syscall name="creat" number="8"/>
-  <syscall name="link" number="9"/>
-  <syscall name="unlink" number="10"/>
-  <syscall name="execv" number="11"/>
-  <syscall name="chdir" number="12"/>
-  <syscall name="chown" number="13"/>
-  <syscall name="mknod" number="14"/>
-  <syscall name="chmod" number="15"/>
-  <syscall name="lchown" number="16"/>
-  <syscall name="brk" number="17"/>
-  <syscall name="perfctr" number="18"/>
-  <syscall name="lseek" number="19"/>
-  <syscall name="getpid" number="20"/>
-  <syscall name="capget" number="21"/>
-  <syscall name="capset" number="22"/>
-  <syscall name="setuid" number="23"/>
-  <syscall name="getuid" number="24"/>
-  <syscall name="vmsplice" number="25"/>
-  <syscall name="ptrace" number="26"/>
-  <syscall name="alarm" number="27"/>
-  <syscall name="sigaltstack" number="28"/>
-  <syscall name="pause" number="29"/>
-  <syscall name="utime" number="30"/>
-  <syscall name="access" number="33"/>
-  <syscall name="nice" number="34"/>
-  <syscall name="sync" number="36"/>
-  <syscall name="kill" number="37"/>
-  <syscall name="stat" number="38"/>
-  <syscall name="sendfile" number="39"/>
-  <syscall name="lstat" number="40"/>
-  <syscall name="dup" number="41"/>
-  <syscall name="pipe" number="42"/>
-  <syscall name="times" number="43"/>
-  <syscall name="umount2" number="45"/>
-  <syscall name="setgid" number="46"/>
-  <syscall name="getgid" number="47"/>
-  <syscall name="signal" number="48"/>
-  <syscall name="geteuid" number="49"/>
-  <syscall name="getegid" number="50"/>
-  <syscall name="acct" number="51"/>
-  <syscall name="memory_ordering" number="52"/>
-  <syscall name="ioctl" number="54"/>
-  <syscall name="reboot" number="55"/>
-  <syscall name="symlink" number="57"/>
-  <syscall name="readlink" number="58"/>
-  <syscall name="execve" number="59"/>
-  <syscall name="umask" number="60"/>
-  <syscall name="chroot" number="61"/>
-  <syscall name="fstat" number="62"/>
-  <syscall name="fstat64" number="63"/>
-  <syscall name="getpagesize" number="64"/>
-  <syscall name="msync" number="65"/>
-  <syscall name="vfork" number="66"/>
-  <syscall name="pread64" number="67"/>
-  <syscall name="pwrite64" number="68"/>
-  <syscall name="mmap" number="71"/>
-  <syscall name="munmap" number="73"/>
-  <syscall name="mprotect" number="74"/>
-  <syscall name="madvise" number="75"/>
-  <syscall name="vhangup" number="76"/>
-  <syscall name="mincore" number="78"/>
-  <syscall name="getgroups" number="79"/>
-  <syscall name="setgroups" number="80"/>
-  <syscall name="getpgrp" number="81"/>
-  <syscall name="setitimer" number="83"/>
-  <syscall name="swapon" number="85"/>
-  <syscall name="getitimer" number="86"/>
-  <syscall name="sethostname" number="88"/>
-  <syscall name="dup2" number="90"/>
-  <syscall name="fcntl" number="92"/>
-  <syscall name="select" number="93"/>
-  <syscall name="fsync" number="95"/>
-  <syscall name="setpriority" number="96"/>
-  <syscall name="socket" number="97"/>
-  <syscall name="connect" number="98"/>
-  <syscall name="accept" number="99"/>
-  <syscall name="getpriority" number="100"/>
-  <syscall name="rt_sigreturn" number="101"/>
-  <syscall name="rt_sigaction" number="102"/>
-  <syscall name="rt_sigprocmask" number="103"/>
-  <syscall name="rt_sigpending" number="104"/>
-  <syscall name="rt_sigtimedwait" number="105"/>
-  <syscall name="rt_sigqueueinfo" number="106"/>
-  <syscall name="rt_sigsuspend" number="107"/>
-  <syscall name="setresuid" number="108"/>
-  <syscall name="getresuid" number="109"/>
-  <syscall name="setresgid" number="110"/>
-  <syscall name="getresgid" number="111"/>
-  <syscall name="recvmsg" number="113"/>
-  <syscall name="sendmsg" number="114"/>
-  <syscall name="gettimeofday" number="116"/>
-  <syscall name="getrusage" number="117"/>
-  <syscall name="getsockopt" number="118"/>
-  <syscall name="getcwd" number="119"/>
-  <syscall name="readv" number="120"/>
-  <syscall name="writev" number="121"/>
-  <syscall name="settimeofday" number="122"/>
-  <syscall name="fchown" number="123"/>
-  <syscall name="fchmod" number="124"/>
-  <syscall name="recvfrom" number="125"/>
-  <syscall name="setreuid" number="126"/>
-  <syscall name="setregid" number="127"/>
-  <syscall name="rename" number="128"/>
-  <syscall name="truncate" number="129"/>
-  <syscall name="ftruncate" number="130"/>
-  <syscall name="flock" number="131"/>
-  <syscall name="lstat64" number="132"/>
-  <syscall name="sendto" number="133"/>
-  <syscall name="shutdown" number="134"/>
-  <syscall name="socketpair" number="135"/>
-  <syscall name="mkdir" number="136"/>
-  <syscall name="rmdir" number="137"/>
-  <syscall name="utimes" number="138"/>
-  <syscall name="stat64" number="139"/>
-  <syscall name="sendfile64" number="140"/>
-  <syscall name="getpeername" number="141"/>
-  <syscall name="futex" number="142"/>
-  <syscall name="gettid" number="143"/>
-  <syscall name="getrlimit" number="144"/>
-  <syscall name="setrlimit" number="145"/>
-  <syscall name="pivot_root" number="146"/>
-  <syscall name="prctl" number="147"/>
-  <syscall name="pciconfig_read" number="148"/>
-  <syscall name="pciconfig_write" number="149"/>
-  <syscall name="getsockname" number="150"/>
-  <syscall name="inotify_init" number="151"/>
-  <syscall name="inotify_add_watch" number="152"/>
-  <syscall name="poll" number="153"/>
-  <syscall name="getdents64" number="154"/>
-  <syscall name="inotify_rm_watch" number="156"/>
-  <syscall name="statfs" number="157"/>
-  <syscall name="fstatfs" number="158"/>
-  <syscall name="umount" number="159"/>
-  <syscall name="sched_set_affinity" number="160"/>
-  <syscall name="sched_get_affinity" number="161"/>
-  <syscall name="getdomainname" number="162"/>
-  <syscall name="setdomainname" number="163"/>
-  <syscall name="utrap_install" number="164"/>
-  <syscall name="quotactl" number="165"/>
-  <syscall name="set_tid_address" number="166"/>
-  <syscall name="mount" number="167"/>
-  <syscall name="ustat" number="168"/>
-  <syscall name="setxattr" number="169"/>
-  <syscall name="lsetxattr" number="170"/>
-  <syscall name="fsetxattr" number="171"/>
-  <syscall name="getxattr" number="172"/>
-  <syscall name="lgetxattr" number="173"/>
-  <syscall name="getdents" number="174"/>
-  <syscall name="setsid" number="175"/>
-  <syscall name="fchdir" number="176"/>
-  <syscall name="fgetxattr" number="177"/>
-  <syscall name="listxattr" number="178"/>
-  <syscall name="llistxattr" number="179"/>
-  <syscall name="flistxattr" number="180"/>
-  <syscall name="removexattr" number="181"/>
-  <syscall name="lremovexattr" number="182"/>
-  <syscall name="sigpending" number="183"/>
-  <syscall name="query_module" number="184"/>
-  <syscall name="setpgid" number="185"/>
-  <syscall name="fremovexattr" number="186"/>
-  <syscall name="tkill" number="187"/>
-  <syscall name="exit_group" number="188"/>
-  <syscall name="uname" number="189"/>
-  <syscall name="init_module" number="190"/>
-  <syscall name="personality" number="191"/>
-  <syscall name="remap_file_pages" number="192"/>
-  <syscall name="epoll_create" number="193"/>
-  <syscall name="epoll_ctl" number="194"/>
-  <syscall name="epoll_wait" number="195"/>
-  <syscall name="ioprio_set" number="196"/>
-  <syscall name="getppid" number="197"/>
-  <syscall name="sigaction" number="198"/>
-  <syscall name="sgetmask" number="199"/>
-  <syscall name="ssetmask" number="200"/>
-  <syscall name="sigsuspend" number="201"/>
-  <syscall name="oldlstat" number="202"/>
-  <syscall name="uselib" number="203"/>
-  <syscall name="readdir" number="204"/>
-  <syscall name="readahead" number="205"/>
-  <syscall name="socketcall" number="206"/>
-  <syscall name="syslog" number="207"/>
-  <syscall name="lookup_dcookie" number="208"/>
-  <syscall name="fadvise64" number="209"/>
-  <syscall name="fadvise64_64" number="210"/>
-  <syscall name="tgkill" number="211"/>
-  <syscall name="waitpid" number="212"/>
-  <syscall name="swapoff" number="213"/>
-  <syscall name="sysinfo" number="214"/>
-  <syscall name="ipc" number="215"/>
-  <syscall name="sigreturn" number="216"/>
-  <syscall name="clone" number="217"/>
-  <syscall name="ioprio_get" number="218"/>
-  <syscall name="adjtimex" number="219"/>
-  <syscall name="sigprocmask" number="220"/>
-  <syscall name="create_module" number="221"/>
-  <syscall name="delete_module" number="222"/>
-  <syscall name="get_kernel_syms" number="223"/>
-  <syscall name="getpgid" number="224"/>
-  <syscall name="bdflush" number="225"/>
-  <syscall name="sysfs" number="226"/>
-  <syscall name="afs_syscall" number="227"/>
-  <syscall name="setfsuid" number="228"/>
-  <syscall name="setfsgid" number="229"/>
-  <syscall name="_newselect" number="230"/>
-  <syscall name="splice" number="232"/>
-  <syscall name="stime" number="233"/>
-  <syscall name="statfs64" number="234"/>
-  <syscall name="fstatfs64" number="235"/>
-  <syscall name="_llseek" number="236"/>
-  <syscall name="mlock" number="237"/>
-  <syscall name="munlock" number="238"/>
-  <syscall name="mlockall" number="239"/>
-  <syscall name="munlockall" number="240"/>
-  <syscall name="sched_setparam" number="241"/>
-  <syscall name="sched_getparam" number="242"/>
-  <syscall name="sched_setscheduler" number="243"/>
-  <syscall name="sched_getscheduler" number="244"/>
-  <syscall name="sched_yield" number="245"/>
-  <syscall name="sched_get_priority_max" number="246"/>
-  <syscall name="sched_get_priority_min" number="247"/>
-  <syscall name="sched_rr_get_interval" number="248"/>
-  <syscall name="nanosleep" number="249"/>
-  <syscall name="mremap" number="250"/>
-  <syscall name="_sysctl" number="251"/>
-  <syscall name="getsid" number="252"/>
-  <syscall name="fdatasync" number="253"/>
-  <syscall name="nfsservctl" number="254"/>
-  <syscall name="sync_file_range" number="255"/>
-  <syscall name="clock_settime" number="256"/>
-  <syscall name="clock_gettime" number="257"/>
-  <syscall name="clock_getres" number="258"/>
-  <syscall name="clock_nanosleep" number="259"/>
-  <syscall name="sched_getaffinity" number="260"/>
-  <syscall name="sched_setaffinity" number="261"/>
-  <syscall name="timer_settime" number="262"/>
-  <syscall name="timer_gettime" number="263"/>
-  <syscall name="timer_getoverrun" number="264"/>
-  <syscall name="timer_delete" number="265"/>
-  <syscall name="timer_create" number="266"/>
-  <syscall name="vserver" number="267"/>
-  <syscall name="io_setup" number="268"/>
-  <syscall name="io_destroy" number="269"/>
-  <syscall name="io_submit" number="270"/>
-  <syscall name="io_cancel" number="271"/>
-  <syscall name="io_getevents" number="272"/>
-  <syscall name="mq_open" number="273"/>
-  <syscall name="mq_unlink" number="274"/>
-  <syscall name="mq_timedsend" number="275"/>
-  <syscall name="mq_timedreceive" number="276"/>
-  <syscall name="mq_notify" number="277"/>
-  <syscall name="mq_getsetattr" number="278"/>
-  <syscall name="waitid" number="279"/>
-  <syscall name="tee" number="280"/>
-  <syscall name="add_key" number="281"/>
-  <syscall name="request_key" number="282"/>
-  <syscall name="keyctl" number="283"/>
-  <syscall name="openat" number="284"/>
-  <syscall name="mkdirat" number="285"/>
-  <syscall name="mknodat" number="286"/>
-  <syscall name="fchownat" number="287"/>
-  <syscall name="futimesat" number="288"/>
-  <syscall name="fstatat64" number="289"/>
-  <syscall name="unlinkat" number="290"/>
-  <syscall name="renameat" number="291"/>
-  <syscall name="linkat" number="292"/>
-  <syscall name="symlinkat" number="293"/>
-  <syscall name="readlinkat" number="294"/>
-  <syscall name="fchmodat" number="295"/>
-  <syscall name="faccessat" number="296"/>
-  <syscall name="pselect6" number="297"/>
-  <syscall name="ppoll" number="298"/>
-  <syscall name="unshare" number="299"/>
-  <syscall name="set_robust_list" number="300"/>
-  <syscall name="get_robust_list" number="301"/>
-  <syscall name="migrate_pages" number="302"/>
-  <syscall name="mbind" number="303"/>
-  <syscall name="get_mempolicy" number="304"/>
-  <syscall name="set_mempolicy" number="305"/>
-  <syscall name="kexec_load" number="306"/>
-  <syscall name="move_pages" number="307"/>
-  <syscall name="getcpu" number="308"/>
-  <syscall name="epoll_pwait" number="309"/>
-  <syscall name="utimensat" number="310"/>
-  <syscall name="signalfd" number="311"/>
-  <syscall name="timerfd_create" number="312"/>
-  <syscall name="eventfd" number="313"/>
-  <syscall name="fallocate" number="314"/>
-  <syscall name="timerfd_settime" number="315"/>
-  <syscall name="timerfd_gettime" number="316"/>
-  <syscall name="signalfd4" number="317"/>
-  <syscall name="eventfd2" number="318"/>
-  <syscall name="epoll_create1" number="319"/>
-  <syscall name="dup3" number="320"/>
-  <syscall name="pipe2" number="321"/>
-  <syscall name="inotify_init1" number="322"/>
-  <syscall name="accept4" number="323"/>
-  <syscall name="preadv" number="324"/>
-  <syscall name="pwritev" number="325"/>
-  <syscall name="rt_tgsigqueueinfo" number="326"/>
-  <syscall name="perf_event_open" number="327"/>
-  <syscall name="recvmmsg" number="328"/>
-</syscalls_info>
diff --git a/x86_64-linux-android/bin/ar b/x86_64-linux-android/bin/ar
deleted file mode 120000
index 96f839e..0000000
--- a/x86_64-linux-android/bin/ar
+++ /dev/null
@@ -1 +0,0 @@
-../../bin/x86_64-linux-android-ar
\ No newline at end of file
diff --git a/x86_64-linux-android/bin/as b/x86_64-linux-android/bin/as
deleted file mode 120000
index 22c6e67..0000000
--- a/x86_64-linux-android/bin/as
+++ /dev/null
@@ -1 +0,0 @@
-../../bin/x86_64-linux-android-as
\ No newline at end of file
diff --git a/x86_64-linux-android/bin/c++ b/x86_64-linux-android/bin/c++
deleted file mode 120000
index ff855a9..0000000
--- a/x86_64-linux-android/bin/c++
+++ /dev/null
@@ -1 +0,0 @@
-../../bin/x86_64-linux-android-c++
\ No newline at end of file
diff --git a/x86_64-linux-android/bin/g++ b/x86_64-linux-android/bin/g++
deleted file mode 120000
index 33306a9..0000000
--- a/x86_64-linux-android/bin/g++
+++ /dev/null
@@ -1 +0,0 @@
-../../bin/x86_64-linux-android-g++
\ No newline at end of file
diff --git a/x86_64-linux-android/bin/gcc b/x86_64-linux-android/bin/gcc
deleted file mode 120000
index 3b4257e..0000000
--- a/x86_64-linux-android/bin/gcc
+++ /dev/null
@@ -1 +0,0 @@
-../../bin/x86_64-linux-android-gcc
\ No newline at end of file
diff --git a/x86_64-linux-android/bin/ld b/x86_64-linux-android/bin/ld
deleted file mode 120000
index 32f02e0..0000000
--- a/x86_64-linux-android/bin/ld
+++ /dev/null
@@ -1 +0,0 @@
-../../bin/x86_64-linux-android-ld
\ No newline at end of file
diff --git a/x86_64-linux-android/bin/ld.bfd b/x86_64-linux-android/bin/ld.bfd
deleted file mode 120000
index 78dec82..0000000
--- a/x86_64-linux-android/bin/ld.bfd
+++ /dev/null
@@ -1 +0,0 @@
-../../bin/x86_64-linux-android-ld.bfd
\ No newline at end of file
diff --git a/x86_64-linux-android/bin/ld.gold b/x86_64-linux-android/bin/ld.gold
deleted file mode 120000
index cc9a1e9..0000000
--- a/x86_64-linux-android/bin/ld.gold
+++ /dev/null
@@ -1 +0,0 @@
-../../bin/x86_64-linux-android-ld.gold
\ No newline at end of file
diff --git a/x86_64-linux-android/bin/ld.mcld b/x86_64-linux-android/bin/ld.mcld
deleted file mode 120000
index e44008b..0000000
--- a/x86_64-linux-android/bin/ld.mcld
+++ /dev/null
@@ -1 +0,0 @@
-../../bin/x86_64-linux-android-ld.mcld
\ No newline at end of file diff --git a/x86_64-linux-android/bin/nm b/x86_64-linux-android/bin/nm deleted file mode 120000 index 2e87749..0000000 --- a/x86_64-linux-android/bin/nm +++ /dev/null @@ -1 +0,0 @@ -../../bin/x86_64-linux-android-nm
\ No newline at end of file diff --git a/x86_64-linux-android/bin/objcopy b/x86_64-linux-android/bin/objcopy deleted file mode 120000 index 22d7916..0000000 --- a/x86_64-linux-android/bin/objcopy +++ /dev/null @@ -1 +0,0 @@ -../../bin/x86_64-linux-android-objcopy
\ No newline at end of file diff --git a/x86_64-linux-android/bin/objdump b/x86_64-linux-android/bin/objdump deleted file mode 120000 index 150ba55..0000000 --- a/x86_64-linux-android/bin/objdump +++ /dev/null @@ -1 +0,0 @@ -../../bin/x86_64-linux-android-objdump
\ No newline at end of file diff --git a/x86_64-linux-android/bin/ranlib b/x86_64-linux-android/bin/ranlib deleted file mode 120000 index 13f26b2..0000000 --- a/x86_64-linux-android/bin/ranlib +++ /dev/null @@ -1 +0,0 @@ -../../bin/x86_64-linux-android-ranlib
\ No newline at end of file diff --git a/x86_64-linux-android/bin/strip b/x86_64-linux-android/bin/strip deleted file mode 120000 index 7404ebb..0000000 --- a/x86_64-linux-android/bin/strip +++ /dev/null @@ -1 +0,0 @@ -../../bin/x86_64-linux-android-strip
\ No newline at end of file diff --git a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.x b/x86_64-linux-android/lib/ldscripts/elf32_x86_64.x deleted file mode 100644 index 0ad5063..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.x +++ /dev/null @@ -1,227 +0,0 @@ -/* Default linker script, for normal executables */ -/* Modified for Android. */ -OUTPUT_FORMAT("elf32-x86-64", "elf32-x86-64", - "elf32-x86-64") -OUTPUT_ARCH(i386:x64-32) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . = 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.ifunc : { *(.rela.ifunc) } - .rela.plt : - { - *(.rela.plt) - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .init : - 
{ - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(32 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 
24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 32 / 8 : 1); - } - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xbn b/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xbn deleted file mode 100644 index fd61792..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xbn +++ /dev/null @@ -1,224 +0,0 @@ -/* Script for -N: mix text and data on same page; don't align data */ -OUTPUT_FORMAT("elf32-x86-64", "elf32-x86-64", - "elf32-x86-64") -OUTPUT_ARCH(i386:x64-32) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.ifunc : { *(.rela.ifunc) } - .rela.plt : - { - *(.rela.plt) - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = .; - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(32 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. 
- The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 32 / 8 : 1); - } - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. 
- Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xc b/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xc deleted file mode 100644 index 96f8ad3..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xc +++ /dev/null @@ -1,228 +0,0 @@ -/* Script for -z combreloc: combine and sort reloc sections */ -OUTPUT_FORMAT("elf32-x86-64", "elf32-x86-64", - "elf32-x86-64") -OUTPUT_ARCH(i386:x64-32) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - *(.rela.ifunc) - } - .rela.plt : - { - *(.rela.plt) - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. 
*/ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(32 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. 
Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . 
= ALIGN(. != 0 ? 32 / 8 : 1); - } - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xd b/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xd deleted file mode 100644 index eeecf60..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xd +++ /dev/null @@ -1,226 +0,0 @@ -/* Script for ld -pie: link position independent executable */ -OUTPUT_FORMAT("elf32-x86-64", "elf32-x86-64", - "elf32-x86-64") -OUTPUT_ARCH(i386:x64-32) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0); . = 0 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.ifunc : { *(.rela.ifunc) } - .rela.plt : - { - 
*(.rela.plt) - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(32 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 
24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 32 / 8 : 1); - } - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xdc b/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xdc deleted file mode 100644 index a19b462..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xdc +++ /dev/null @@ -1,228 +0,0 @@ -/* Script for -pie -z combreloc: position independent executable, combine & sort relocs */ -OUTPUT_FORMAT("elf32-x86-64", "elf32-x86-64", - "elf32-x86-64") -OUTPUT_ARCH(i386:x64-32) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0); . 
= 0 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - *(.rela.ifunc) - } - .rela.plt : - { - *(.rela.plt) - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. 
For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(32 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. 
*/ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 32 / 8 : 1); - } - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. 
*/ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xdw b/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xdw deleted file mode 100644 index ee71093..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xdw +++ /dev/null @@ -1,227 +0,0 @@ -/* Script for -pie -z combreloc -z now -z relro: position independent executable, combine & sort relocs */ -OUTPUT_FORMAT("elf32-x86-64", "elf32-x86-64", - "elf32-x86-64") -OUTPUT_ARCH(i386:x64-32) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0); . = 0 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - *(.rela.ifunc) - } - .rela.plt : - { - *(.rela.plt) - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - 
*(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(32 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } - . 
= DATA_SEGMENT_RELRO_END (0, .); - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 32 / 8 : 1); - } - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xn b/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xn deleted file mode 100644 index 7580acb..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xn +++ /dev/null @@ -1,226 +0,0 @@ -/* Script for -n: mix text and data on same page */ -OUTPUT_FORMAT("elf32-x86-64", "elf32-x86-64", - "elf32-x86-64") -OUTPUT_ARCH(i386:x64-32) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.ifunc : { *(.rela.ifunc) } - .rela.plt : - { - *(.rela.plt) - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(32 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 
24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 32 / 8 : 1); - } - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xr b/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xr deleted file mode 100644 index ca82fef..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xr +++ /dev/null @@ -1,153 +0,0 @@ -/* Script for ld -r: link without relocation */ -OUTPUT_FORMAT("elf32-x86-64", "elf32-x86-64", - "elf32-x86-64") -OUTPUT_ARCH(i386:x64-32) - /* For some reason, the Solaris linker makes bad executables - if gld -r is used and the intermediate file has sections starting - at non-zero addresses. Could be a Solaris ld bug, could be a GNU ld - bug. But for now assigning the zero vmas works. 
*/ -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - .interp 0 : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash 0 : { *(.hash) } - .gnu.hash 0 : { *(.gnu.hash) } - .dynsym 0 : { *(.dynsym) } - .dynstr 0 : { *(.dynstr) } - .gnu.version 0 : { *(.gnu.version) } - .gnu.version_d 0: { *(.gnu.version_d) } - .gnu.version_r 0: { *(.gnu.version_r) } - .rela.init 0 : { *(.rela.init) } - .rela.text 0 : { *(.rela.text) } - .rela.fini 0 : { *(.rela.fini) } - .rela.rodata 0 : { *(.rela.rodata) } - .rela.data.rel.ro 0 : { *(.rela.data.rel.ro) } - .rela.data 0 : { *(.rela.data) } - .rela.tdata 0 : { *(.rela.tdata) } - .rela.tbss 0 : { *(.rela.tbss) } - .rela.ctors 0 : { *(.rela.ctors) } - .rela.dtors 0 : { *(.rela.dtors) } - .rela.got 0 : { *(.rela.got) } - .rela.bss 0 : { *(.rela.bss) } - .rela.ldata 0 : { *(.rela.ldata) } - .rela.lbss 0 : { *(.rela.lbss) } - .rela.lrodata 0 : { *(.rela.lrodata) } - .rela.ifunc 0 : { *(.rela.ifunc) } - .rela.plt 0 : - { - *(.rela.plt) - } - .init 0 : - { - KEEP (*(SORT_NONE(.init))) - } - .plt 0 : { *(.plt) *(.iplt) } - .text 0 : - { - *(.text .stub) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini 0 : - { - KEEP (*(SORT_NONE(.fini))) - } - .rodata 0 : { *(.rodata) } - .rodata1 0 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame 0 : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges 0 : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. 
*/ - /* Exception handling */ - .eh_frame 0 : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges 0 : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata 0 : { *(.tdata) } - .tbss 0 : { *(.tbss) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - .preinit_array 0 : - { - KEEP (*(.preinit_array)) - } - .jcr 0 : { KEEP (*(.jcr)) } - .dynamic 0 : { *(.dynamic) } - .got 0 : { *(.got) *(.igot) } - .got.plt 0 : { *(.got.plt) *(.igot.plt) } - .data 0 : - { - *(.data) - } - .data1 0 : { *(.data1) } - .bss 0 : - { - *(.dynbss) - *(.bss) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - } - .lbss 0 : - { - *(.dynlbss) - *(.lbss) - *(LARGE_COMMON) - } - .lrodata 0 : - { - *(.lrodata) - } - .ldata 0 : - { - *(.ldata) - } - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xs b/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xs deleted file mode 100644 index 54cb13b..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xs +++ /dev/null @@ -1,217 +0,0 @@ -/* Script for ld --shared: link shared library */ -OUTPUT_FORMAT("elf32-x86-64", "elf32-x86-64", - "elf32-x86-64") -OUTPUT_ARCH(i386:x64-32) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - . 
= 0 + SIZEOF_HEADERS; - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.ifunc : { *(.rela.ifunc) } - .rela.plt : - { - *(.rela.plt) - *(.rela.iplt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(32 / 8); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . 
= .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 32 / 8 : 1); - } - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xsc b/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xsc deleted file mode 100644 index 521515d..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xsc +++ /dev/null @@ -1,220 +0,0 @@ -/* Script for --shared -z combreloc: shared library, combine & sort relocs */ -/* Modified for Android. */ -OUTPUT_FORMAT("elf32-x86-64", "elf32-x86-64", - "elf32-x86-64") -OUTPUT_ARCH(i386:x64-32) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - . 
= 0 + SIZEOF_HEADERS; - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - *(.rela.ifunc) - } - .rela.plt : - { - *(.rela.plt) - *(.rela.iplt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . 
= ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(32 / 8); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. 
- The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 32 / 8 : 1); - } - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. 
*/ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xsw b/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xsw deleted file mode 100644 index aada16e..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xsw +++ /dev/null @@ -1,218 +0,0 @@ -/* Script for --shared -z combreloc -z now -z relro: shared library, combine & sort relocs */ -OUTPUT_FORMAT("elf32-x86-64", "elf32-x86-64", - "elf32-x86-64") -OUTPUT_ARCH(i386:x64-32) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - . = 0 + SIZEOF_HEADERS; - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - *(.rela.ifunc) - } - .rela.plt : - { - *(.rela.plt) - *(.rela.iplt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by 
elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(32 / 8); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (0, .); - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . 
= .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 32 / 8 : 1); - } - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xu b/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xu deleted file mode 100644 index 881dafd..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xu +++ /dev/null @@ -1,154 +0,0 @@ -/* Script for ld -Ur: link w/out relocation, do create constructors */ -OUTPUT_FORMAT("elf32-x86-64", "elf32-x86-64", - "elf32-x86-64") -OUTPUT_ARCH(i386:x64-32) - /* For some reason, the Solaris linker makes bad executables - if gld -r is used and the intermediate file has sections starting - at non-zero addresses. Could be a Solaris ld bug, could be a GNU ld - bug. But for now assigning the zero vmas works. 
*/ -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - .interp 0 : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash 0 : { *(.hash) } - .gnu.hash 0 : { *(.gnu.hash) } - .dynsym 0 : { *(.dynsym) } - .dynstr 0 : { *(.dynstr) } - .gnu.version 0 : { *(.gnu.version) } - .gnu.version_d 0: { *(.gnu.version_d) } - .gnu.version_r 0: { *(.gnu.version_r) } - .rela.init 0 : { *(.rela.init) } - .rela.text 0 : { *(.rela.text) } - .rela.fini 0 : { *(.rela.fini) } - .rela.rodata 0 : { *(.rela.rodata) } - .rela.data.rel.ro 0 : { *(.rela.data.rel.ro) } - .rela.data 0 : { *(.rela.data) } - .rela.tdata 0 : { *(.rela.tdata) } - .rela.tbss 0 : { *(.rela.tbss) } - .rela.ctors 0 : { *(.rela.ctors) } - .rela.dtors 0 : { *(.rela.dtors) } - .rela.got 0 : { *(.rela.got) } - .rela.bss 0 : { *(.rela.bss) } - .rela.ldata 0 : { *(.rela.ldata) } - .rela.lbss 0 : { *(.rela.lbss) } - .rela.lrodata 0 : { *(.rela.lrodata) } - .rela.ifunc 0 : { *(.rela.ifunc) } - .rela.plt 0 : - { - *(.rela.plt) - } - .init 0 : - { - KEEP (*(SORT_NONE(.init))) - } - .plt 0 : { *(.plt) *(.iplt) } - .text 0 : - { - *(.text .stub) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini 0 : - { - KEEP (*(SORT_NONE(.fini))) - } - .rodata 0 : { *(.rodata) } - .rodata1 0 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame 0 : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges 0 : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. 
*/ - /* Exception handling */ - .eh_frame 0 : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges 0 : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata 0 : { *(.tdata) } - .tbss 0 : { *(.tbss) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - .preinit_array 0 : - { - KEEP (*(.preinit_array)) - } - .jcr 0 : { KEEP (*(.jcr)) } - .dynamic 0 : { *(.dynamic) } - .got 0 : { *(.got) *(.igot) } - .got.plt 0 : { *(.got.plt) *(.igot.plt) } - .data 0 : - { - *(.data) - SORT(CONSTRUCTORS) - } - .data1 0 : { *(.data1) } - .bss 0 : - { - *(.dynbss) - *(.bss) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - } - .lbss 0 : - { - *(.dynlbss) - *(.lbss) - *(LARGE_COMMON) - } - .lrodata 0 : - { - *(.lrodata) - } - .ldata 0 : - { - *(.ldata) - } - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xw b/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xw deleted file mode 100644 index a3119b0..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf32_x86_64.xw +++ /dev/null @@ -1,227 +0,0 @@ -/* Script for -z combreloc -z now -z relro: combine and sort reloc sections */ -OUTPUT_FORMAT("elf32-x86-64", "elf32-x86-64", - "elf32-x86-64") -OUTPUT_ARCH(i386:x64-32) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - *(.rela.ifunc) - } - .rela.plt : - { - *(.rela.plt) - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. 
*/ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(32 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. 
Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (0, .); - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 32 / 8 : 1); - } - . 
= ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_i386.x b/x86_64-linux-android/lib/ldscripts/elf_i386.x deleted file mode 100644 index 75807ad..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_i386.x +++ /dev/null @@ -1,209 +0,0 @@ -/* Default linker script, for normal executables */ -/* Modified for Android. */ -OUTPUT_FORMAT("elf32-i386", "elf32-i386", - "elf32-i386") -OUTPUT_ARCH(i386) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x08048000); . = 0x08048000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rel.init : { *(.rel.init) } - .rel.text : { *(.rel.text .rel.text.* .rel.gnu.linkonce.t.*) } - .rel.fini : { *(.rel.fini) } - .rel.rodata : { *(.rel.rodata .rel.rodata.* .rel.gnu.linkonce.r.*) } - .rel.data.rel.ro : { *(.rel.data.rel.ro .rel.data.rel.ro.* .rel.gnu.linkonce.d.rel.ro.*) } - .rel.data : { *(.rel.data .rel.data.* .rel.gnu.linkonce.d.*) } - .rel.tdata : { *(.rel.tdata .rel.tdata.* .rel.gnu.linkonce.td.*) } - .rel.tbss : { *(.rel.tbss .rel.tbss.* .rel.gnu.linkonce.tb.*) } - .rel.ctors : { *(.rel.ctors) } - .rel.dtors : { *(.rel.dtors) } - .rel.got : { *(.rel.got) } - .rel.bss : { *(.rel.bss .rel.bss.* .rel.gnu.linkonce.b.*) } - .rel.ifunc : { *(.rel.ifunc) } - .rel.plt : - { - *(.rel.plt) - PROVIDE_HIDDEN (__rel_iplt_start = .); - *(.rel.iplt) - PROVIDE_HIDDEN (__rel_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - 
*(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(32 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 12 ? 
12 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_i386.xbn b/x86_64-linux-android/lib/ldscripts/elf_i386.xbn deleted file mode 100644 index 137fdd2..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_i386.xbn +++ /dev/null @@ -1,206 +0,0 @@ -/* Script for -N: mix text and data on same page; don't align data */ -OUTPUT_FORMAT("elf32-i386", "elf32-i386", - "elf32-i386") -OUTPUT_ARCH(i386) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x08048000); . 
= 0x08048000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rel.init : { *(.rel.init) } - .rel.text : { *(.rel.text .rel.text.* .rel.gnu.linkonce.t.*) } - .rel.fini : { *(.rel.fini) } - .rel.rodata : { *(.rel.rodata .rel.rodata.* .rel.gnu.linkonce.r.*) } - .rel.data.rel.ro : { *(.rel.data.rel.ro .rel.data.rel.ro.* .rel.gnu.linkonce.d.rel.ro.*) } - .rel.data : { *(.rel.data .rel.data.* .rel.gnu.linkonce.d.*) } - .rel.tdata : { *(.rel.tdata .rel.tdata.* .rel.gnu.linkonce.td.*) } - .rel.tbss : { *(.rel.tbss .rel.tbss.* .rel.gnu.linkonce.tb.*) } - .rel.ctors : { *(.rel.ctors) } - .rel.dtors : { *(.rel.dtors) } - .rel.got : { *(.rel.got) } - .rel.bss : { *(.rel.bss .rel.bss.* .rel.gnu.linkonce.b.*) } - .rel.ifunc : { *(.rel.ifunc) } - .rel.plt : - { - *(.rel.plt) - PROVIDE_HIDDEN (__rel_iplt_start = .); - *(.rel.iplt) - PROVIDE_HIDDEN (__rel_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. 
*/ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = .; - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(32 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. 
The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_i386.xc b/x86_64-linux-android/lib/ldscripts/elf_i386.xc deleted file mode 100644 index 41f2a97..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_i386.xc +++ /dev/null @@ -1,211 +0,0 @@ -/* Script for -z combreloc: combine and sort reloc sections */ -OUTPUT_FORMAT("elf32-i386", "elf32-i386", - "elf32-i386") -OUTPUT_ARCH(i386) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x08048000); . 
= 0x08048000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rel.dyn : - { - *(.rel.init) - *(.rel.text .rel.text.* .rel.gnu.linkonce.t.*) - *(.rel.fini) - *(.rel.rodata .rel.rodata.* .rel.gnu.linkonce.r.*) - *(.rel.data.rel.ro .rel.data.rel.ro.* .rel.gnu.linkonce.d.rel.ro.*) - *(.rel.data .rel.data.* .rel.gnu.linkonce.d.*) - *(.rel.tdata .rel.tdata.* .rel.gnu.linkonce.td.*) - *(.rel.tbss .rel.tbss.* .rel.gnu.linkonce.tb.*) - *(.rel.ctors) - *(.rel.dtors) - *(.rel.got) - *(.rel.bss .rel.bss.* .rel.gnu.linkonce.b.*) - *(.rel.ifunc) - } - .rel.plt : - { - *(.rel.plt) - PROVIDE_HIDDEN (__rel_iplt_start = .); - *(.rel.iplt) - PROVIDE_HIDDEN (__rel_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . 
= ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(32 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. 
*/ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 12 ? 12 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_i386.xd b/x86_64-linux-android/lib/ldscripts/elf_i386.xd deleted file mode 100644 index 62119c1..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_i386.xd +++ /dev/null @@ -1,208 +0,0 @@ -/* Script for ld -pie: link position independent executable */ -OUTPUT_FORMAT("elf32-i386", "elf32-i386", - "elf32-i386") -OUTPUT_ARCH(i386) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0); . 
= 0 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rel.init : { *(.rel.init) } - .rel.text : { *(.rel.text .rel.text.* .rel.gnu.linkonce.t.*) } - .rel.fini : { *(.rel.fini) } - .rel.rodata : { *(.rel.rodata .rel.rodata.* .rel.gnu.linkonce.r.*) } - .rel.data.rel.ro : { *(.rel.data.rel.ro .rel.data.rel.ro.* .rel.gnu.linkonce.d.rel.ro.*) } - .rel.data : { *(.rel.data .rel.data.* .rel.gnu.linkonce.d.*) } - .rel.tdata : { *(.rel.tdata .rel.tdata.* .rel.gnu.linkonce.td.*) } - .rel.tbss : { *(.rel.tbss .rel.tbss.* .rel.gnu.linkonce.tb.*) } - .rel.ctors : { *(.rel.ctors) } - .rel.dtors : { *(.rel.dtors) } - .rel.got : { *(.rel.got) } - .rel.bss : { *(.rel.bss .rel.bss.* .rel.gnu.linkonce.b.*) } - .rel.ifunc : { *(.rel.ifunc) } - .rel.plt : - { - *(.rel.plt) - PROVIDE_HIDDEN (__rel_iplt_start = .); - *(.rel.iplt) - PROVIDE_HIDDEN (__rel_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. 
*/ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(32 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. 
Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 12 ? 12 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. 
*/ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_i386.xdc b/x86_64-linux-android/lib/ldscripts/elf_i386.xdc deleted file mode 100644 index e7d5332..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_i386.xdc +++ /dev/null @@ -1,211 +0,0 @@ -/* Script for -pie -z combreloc: position independent executable, combine & sort relocs */ -OUTPUT_FORMAT("elf32-i386", "elf32-i386", - "elf32-i386") -OUTPUT_ARCH(i386) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0); . = 0 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rel.dyn : - { - *(.rel.init) - *(.rel.text .rel.text.* .rel.gnu.linkonce.t.*) - *(.rel.fini) - *(.rel.rodata .rel.rodata.* .rel.gnu.linkonce.r.*) - *(.rel.data.rel.ro .rel.data.rel.ro.* .rel.gnu.linkonce.d.rel.ro.*) - *(.rel.data .rel.data.* .rel.gnu.linkonce.d.*) - *(.rel.tdata .rel.tdata.* .rel.gnu.linkonce.td.*) - *(.rel.tbss .rel.tbss.* .rel.gnu.linkonce.tb.*) - *(.rel.ctors) - *(.rel.dtors) - *(.rel.got) - *(.rel.bss .rel.bss.* .rel.gnu.linkonce.b.*) - *(.rel.ifunc) - } - .rel.plt : - { - *(.rel.plt) - PROVIDE_HIDDEN (__rel_iplt_start = .); - *(.rel.iplt) - PROVIDE_HIDDEN (__rel_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(32 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 12 ? 
12 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_i386.xdw b/x86_64-linux-android/lib/ldscripts/elf_i386.xdw deleted file mode 100644 index caf905a..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_i386.xdw +++ /dev/null @@ -1,210 +0,0 @@ -/* Script for -pie -z combreloc -z now -z relro: position independent executable, combine & sort relocs */ -OUTPUT_FORMAT("elf32-i386", "elf32-i386", - "elf32-i386") -OUTPUT_ARCH(i386) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0); . 
= 0 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rel.dyn : - { - *(.rel.init) - *(.rel.text .rel.text.* .rel.gnu.linkonce.t.*) - *(.rel.fini) - *(.rel.rodata .rel.rodata.* .rel.gnu.linkonce.r.*) - *(.rel.data.rel.ro .rel.data.rel.ro.* .rel.gnu.linkonce.d.rel.ro.*) - *(.rel.data .rel.data.* .rel.gnu.linkonce.d.*) - *(.rel.tdata .rel.tdata.* .rel.gnu.linkonce.td.*) - *(.rel.tbss .rel.tbss.* .rel.gnu.linkonce.tb.*) - *(.rel.ctors) - *(.rel.dtors) - *(.rel.got) - *(.rel.bss .rel.bss.* .rel.gnu.linkonce.b.*) - *(.rel.ifunc) - } - .rel.plt : - { - *(.rel.plt) - PROVIDE_HIDDEN (__rel_iplt_start = .); - *(.rel.iplt) - PROVIDE_HIDDEN (__rel_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . 
= ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(32 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. 
*/ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (0, .); - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_i386.xn b/x86_64-linux-android/lib/ldscripts/elf_i386.xn deleted file mode 100644 index df0b9c7..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_i386.xn +++ /dev/null @@ -1,208 +0,0 @@ -/* Script for -n: mix text and data on same page */ -OUTPUT_FORMAT("elf32-i386", "elf32-i386", - "elf32-i386") -OUTPUT_ARCH(i386) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x08048000); . 
= 0x08048000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rel.init : { *(.rel.init) } - .rel.text : { *(.rel.text .rel.text.* .rel.gnu.linkonce.t.*) } - .rel.fini : { *(.rel.fini) } - .rel.rodata : { *(.rel.rodata .rel.rodata.* .rel.gnu.linkonce.r.*) } - .rel.data.rel.ro : { *(.rel.data.rel.ro .rel.data.rel.ro.* .rel.gnu.linkonce.d.rel.ro.*) } - .rel.data : { *(.rel.data .rel.data.* .rel.gnu.linkonce.d.*) } - .rel.tdata : { *(.rel.tdata .rel.tdata.* .rel.gnu.linkonce.td.*) } - .rel.tbss : { *(.rel.tbss .rel.tbss.* .rel.gnu.linkonce.tb.*) } - .rel.ctors : { *(.rel.ctors) } - .rel.dtors : { *(.rel.dtors) } - .rel.got : { *(.rel.got) } - .rel.bss : { *(.rel.bss .rel.bss.* .rel.gnu.linkonce.b.*) } - .rel.ifunc : { *(.rel.ifunc) } - .rel.plt : - { - *(.rel.plt) - PROVIDE_HIDDEN (__rel_iplt_start = .); - *(.rel.iplt) - PROVIDE_HIDDEN (__rel_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. 
*/ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(32 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. 
Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 12 ? 12 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. 
*/ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_i386.xr b/x86_64-linux-android/lib/ldscripts/elf_i386.xr deleted file mode 100644 index 6a84459..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_i386.xr +++ /dev/null @@ -1,136 +0,0 @@ -/* Script for ld -r: link without relocation */ -OUTPUT_FORMAT("elf32-i386", "elf32-i386", - "elf32-i386") -OUTPUT_ARCH(i386) - /* For some reason, the Solaris linker makes bad executables - if gld -r is used and the intermediate file has sections starting - at non-zero addresses. Could be a Solaris ld bug, could be a GNU ld - bug. But for now assigning the zero vmas works. */ -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - .interp 0 : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash 0 : { *(.hash) } - .gnu.hash 0 : { *(.gnu.hash) } - .dynsym 0 : { *(.dynsym) } - .dynstr 0 : { *(.dynstr) } - .gnu.version 0 : { *(.gnu.version) } - .gnu.version_d 0: { *(.gnu.version_d) } - .gnu.version_r 0: { *(.gnu.version_r) } - .rel.init 0 : { *(.rel.init) } - .rel.text 0 : { *(.rel.text) } - .rel.fini 0 : { *(.rel.fini) } - .rel.rodata 0 : { *(.rel.rodata) } - .rel.data.rel.ro 0 : { *(.rel.data.rel.ro) } - .rel.data 0 : { *(.rel.data) } - .rel.tdata 0 : { *(.rel.tdata) } - .rel.tbss 0 : { *(.rel.tbss) } - .rel.ctors 0 : { *(.rel.ctors) } - .rel.dtors 0 : { *(.rel.dtors) } - .rel.got 0 : { *(.rel.got) } - .rel.bss 0 : { *(.rel.bss) } - .rel.ifunc 0 : { *(.rel.ifunc) } - .rel.plt 0 : - { - *(.rel.plt) - } - .init 0 : - { - KEEP (*(SORT_NONE(.init))) - } - .plt 0 : { *(.plt) *(.iplt) } - .text 0 : - { - *(.text .stub) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini 0 : - { - KEEP (*(SORT_NONE(.fini))) - } - .rodata 0 : { *(.rodata) } - .rodata1 0 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame 0 : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges 0 : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - /* Exception handling */ - .eh_frame 0 : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges 0 : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata 0 : { *(.tdata) } - .tbss 0 : { *(.tbss) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - .preinit_array 0 : - { - KEEP (*(.preinit_array)) - } - .jcr 0 : { KEEP (*(.jcr)) } - .dynamic 0 : { *(.dynamic) } - .got 0 : { *(.got) *(.igot) } - .got.plt 0 : { *(.got.plt) *(.igot.plt) } - .data 0 : - { - *(.data) - } - .data1 0 : { *(.data1) } - .bss 0 : - { - *(.dynbss) - *(.bss) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - } - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. 
- Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_i386.xs b/x86_64-linux-android/lib/ldscripts/elf_i386.xs deleted file mode 100644 index b534aa7..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_i386.xs +++ /dev/null @@ -1,199 +0,0 @@ -/* Script for ld --shared: link shared library */ -OUTPUT_FORMAT("elf32-i386", "elf32-i386", - "elf32-i386") -OUTPUT_ARCH(i386) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - . 
= 0 + SIZEOF_HEADERS; - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rel.init : { *(.rel.init) } - .rel.text : { *(.rel.text .rel.text.* .rel.gnu.linkonce.t.*) } - .rel.fini : { *(.rel.fini) } - .rel.rodata : { *(.rel.rodata .rel.rodata.* .rel.gnu.linkonce.r.*) } - .rel.data.rel.ro : { *(.rel.data.rel.ro .rel.data.rel.ro.* .rel.gnu.linkonce.d.rel.ro.*) } - .rel.data : { *(.rel.data .rel.data.* .rel.gnu.linkonce.d.*) } - .rel.tdata : { *(.rel.tdata .rel.tdata.* .rel.gnu.linkonce.td.*) } - .rel.tbss : { *(.rel.tbss .rel.tbss.* .rel.gnu.linkonce.tb.*) } - .rel.ctors : { *(.rel.ctors) } - .rel.dtors : { *(.rel.dtors) } - .rel.got : { *(.rel.got) } - .rel.bss : { *(.rel.bss .rel.bss.* .rel.gnu.linkonce.b.*) } - .rel.ifunc : { *(.rel.ifunc) } - .rel.plt : - { - *(.rel.plt) - *(.rel.iplt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. 
For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(32 / 8); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. 
- The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 12 ? 12 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_i386.xsc b/x86_64-linux-android/lib/ldscripts/elf_i386.xsc deleted file mode 100644 index ab707bf..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_i386.xsc +++ /dev/null @@ -1,203 +0,0 @@ -/* Script for --shared -z combreloc: shared library, combine & sort relocs */ -/* Modified for Android. */ -OUTPUT_FORMAT("elf32-i386", "elf32-i386", - "elf32-i386") -OUTPUT_ARCH(i386) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - . 
= 0 + SIZEOF_HEADERS; - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rel.dyn : - { - *(.rel.init) - *(.rel.text .rel.text.* .rel.gnu.linkonce.t.*) - *(.rel.fini) - *(.rel.rodata .rel.rodata.* .rel.gnu.linkonce.r.*) - *(.rel.data.rel.ro .rel.data.rel.ro.* .rel.gnu.linkonce.d.rel.ro.*) - *(.rel.data .rel.data.* .rel.gnu.linkonce.d.*) - *(.rel.tdata .rel.tdata.* .rel.gnu.linkonce.td.*) - *(.rel.tbss .rel.tbss.* .rel.gnu.linkonce.tb.*) - *(.rel.ctors) - *(.rel.dtors) - *(.rel.got) - *(.rel.bss .rel.bss.* .rel.gnu.linkonce.b.*) - *(.rel.ifunc) - } - .rel.plt : - { - *(.rel.plt) - *(.rel.iplt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)); . 
= DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(32 / 8); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. 
- The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 12 ? 12 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_i386.xsw b/x86_64-linux-android/lib/ldscripts/elf_i386.xsw deleted file mode 100644 index eb92bf4..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_i386.xsw +++ /dev/null @@ -1,201 +0,0 @@ -/* Script for --shared -z combreloc -z now -z relro: shared library, combine & sort relocs */ -OUTPUT_FORMAT("elf32-i386", "elf32-i386", - "elf32-i386") -OUTPUT_ARCH(i386) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - . 
= 0 + SIZEOF_HEADERS; - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rel.dyn : - { - *(.rel.init) - *(.rel.text .rel.text.* .rel.gnu.linkonce.t.*) - *(.rel.fini) - *(.rel.rodata .rel.rodata.* .rel.gnu.linkonce.r.*) - *(.rel.data.rel.ro .rel.data.rel.ro.* .rel.gnu.linkonce.d.rel.ro.*) - *(.rel.data .rel.data.* .rel.gnu.linkonce.d.*) - *(.rel.tdata .rel.tdata.* .rel.gnu.linkonce.td.*) - *(.rel.tbss .rel.tbss.* .rel.gnu.linkonce.tb.*) - *(.rel.ctors) - *(.rel.dtors) - *(.rel.got) - *(.rel.bss .rel.bss.* .rel.gnu.linkonce.b.*) - *(.rel.ifunc) - } - .rel.plt : - { - *(.rel.plt) - *(.rel.iplt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)); . 
= DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(32 / 8); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. 
- The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (0, .); - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_i386.xu b/x86_64-linux-android/lib/ldscripts/elf_i386.xu deleted file mode 100644 index 42cca6d..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_i386.xu +++ /dev/null @@ -1,137 +0,0 @@ -/* Script for ld -Ur: link w/out relocation, do create constructors */ -OUTPUT_FORMAT("elf32-i386", "elf32-i386", - "elf32-i386") -OUTPUT_ARCH(i386) - /* For some reason, the Solaris linker makes bad executables - if gld -r is used and the intermediate file has sections starting - at non-zero addresses. Could be a Solaris ld bug, could be a GNU ld - bug. But for now assigning the zero vmas works. 
*/ -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - .interp 0 : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash 0 : { *(.hash) } - .gnu.hash 0 : { *(.gnu.hash) } - .dynsym 0 : { *(.dynsym) } - .dynstr 0 : { *(.dynstr) } - .gnu.version 0 : { *(.gnu.version) } - .gnu.version_d 0: { *(.gnu.version_d) } - .gnu.version_r 0: { *(.gnu.version_r) } - .rel.init 0 : { *(.rel.init) } - .rel.text 0 : { *(.rel.text) } - .rel.fini 0 : { *(.rel.fini) } - .rel.rodata 0 : { *(.rel.rodata) } - .rel.data.rel.ro 0 : { *(.rel.data.rel.ro) } - .rel.data 0 : { *(.rel.data) } - .rel.tdata 0 : { *(.rel.tdata) } - .rel.tbss 0 : { *(.rel.tbss) } - .rel.ctors 0 : { *(.rel.ctors) } - .rel.dtors 0 : { *(.rel.dtors) } - .rel.got 0 : { *(.rel.got) } - .rel.bss 0 : { *(.rel.bss) } - .rel.ifunc 0 : { *(.rel.ifunc) } - .rel.plt 0 : - { - *(.rel.plt) - } - .init 0 : - { - KEEP (*(SORT_NONE(.init))) - } - .plt 0 : { *(.plt) *(.iplt) } - .text 0 : - { - *(.text .stub) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini 0 : - { - KEEP (*(SORT_NONE(.fini))) - } - .rodata 0 : { *(.rodata) } - .rodata1 0 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame 0 : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges 0 : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. 
*/ - /* Exception handling */ - .eh_frame 0 : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges 0 : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata 0 : { *(.tdata) } - .tbss 0 : { *(.tbss) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - .preinit_array 0 : - { - KEEP (*(.preinit_array)) - } - .jcr 0 : { KEEP (*(.jcr)) } - .dynamic 0 : { *(.dynamic) } - .got 0 : { *(.got) *(.igot) } - .got.plt 0 : { *(.got.plt) *(.igot.plt) } - .data 0 : - { - *(.data) - SORT(CONSTRUCTORS) - } - .data1 0 : { *(.data1) } - .bss 0 : - { - *(.dynbss) - *(.bss) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - } - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_i386.xw b/x86_64-linux-android/lib/ldscripts/elf_i386.xw deleted file mode 100644 index f25dd62..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_i386.xw +++ /dev/null @@ -1,210 +0,0 @@ -/* Script for -z combreloc -z now -z relro: combine and sort reloc sections */ -OUTPUT_FORMAT("elf32-i386", "elf32-i386", - "elf32-i386") -OUTPUT_ARCH(i386) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x08048000); . 
= 0x08048000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rel.dyn : - { - *(.rel.init) - *(.rel.text .rel.text.* .rel.gnu.linkonce.t.*) - *(.rel.fini) - *(.rel.rodata .rel.rodata.* .rel.gnu.linkonce.r.*) - *(.rel.data.rel.ro .rel.data.rel.ro.* .rel.gnu.linkonce.d.rel.ro.*) - *(.rel.data .rel.data.* .rel.gnu.linkonce.d.*) - *(.rel.tdata .rel.tdata.* .rel.gnu.linkonce.td.*) - *(.rel.tbss .rel.tbss.* .rel.gnu.linkonce.tb.*) - *(.rel.ctors) - *(.rel.dtors) - *(.rel.got) - *(.rel.bss .rel.bss.* .rel.gnu.linkonce.b.*) - *(.rel.ifunc) - } - .rel.plt : - { - *(.rel.plt) - PROVIDE_HIDDEN (__rel_iplt_start = .); - *(.rel.iplt) - PROVIDE_HIDDEN (__rel_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . 
= ALIGN (CONSTANT (MAXPAGESIZE)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(32 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. 
*/ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (0, .); - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(32 / 8); - } - . = ALIGN(32 / 8); - . = SEGMENT_START("ldata-segment", .); - . = ALIGN(32 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_k1om.x b/x86_64-linux-android/lib/ldscripts/elf_k1om.x deleted file mode 100644 index 2990d71..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_k1om.x +++ /dev/null @@ -1,230 +0,0 @@ -/* Default linker script, for normal executables */ -/* Modified for Android. */ -OUTPUT_FORMAT("elf64-k1om", "elf64-k1om", - "elf64-k1om") -OUTPUT_ARCH(k1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.iplt : - { - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 
24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_k1om.xbn b/x86_64-linux-android/lib/ldscripts/elf_k1om.xbn deleted file mode 100644 index c0d2912..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_k1om.xbn +++ /dev/null @@ -1,227 +0,0 @@ -/* Script for -N: mix text and data on same page; don't align data */ -OUTPUT_FORMAT("elf64-k1om", "elf64-k1om", - "elf64-k1om") -OUTPUT_ARCH(k1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.iplt : - { - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = .; - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. 
- The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. 
- Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_k1om.xc b/x86_64-linux-android/lib/ldscripts/elf_k1om.xc deleted file mode 100644 index 19b532a..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_k1om.xc +++ /dev/null @@ -1,230 +0,0 @@ -/* Script for -z combreloc: combine and sort reloc sections */ -OUTPUT_FORMAT("elf64-k1om", "elf64-k1om", - "elf64-k1om") -OUTPUT_ARCH(k1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - PROVIDE_HIDDEN (__rel_iplt_start = .); - PROVIDE_HIDDEN (__rel_iplt_end = .); - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. 
*/ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. 
Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . 
= ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_k1om.xd b/x86_64-linux-android/lib/ldscripts/elf_k1om.xd deleted file mode 100644 index b6b3a99..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_k1om.xd +++ /dev/null @@ -1,229 +0,0 @@ -/* Script for ld -pie: link position independent executable */ -OUTPUT_FORMAT("elf64-k1om", "elf64-k1om", - "elf64-k1om") -OUTPUT_ARCH(k1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0); . = 0 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.iplt : - { - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - 
PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 
24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_k1om.xdc b/x86_64-linux-android/lib/ldscripts/elf_k1om.xdc deleted file mode 100644 index 78d8292..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_k1om.xdc +++ /dev/null @@ -1,230 +0,0 @@ -/* Script for -pie -z combreloc: position independent executable, combine & sort relocs */ -OUTPUT_FORMAT("elf64-k1om", "elf64-k1om", - "elf64-k1om") -OUTPUT_ARCH(k1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0); . 
= 0 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - PROVIDE_HIDDEN (__rel_iplt_start = .); - PROVIDE_HIDDEN (__rel_iplt_end = .); - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. 
*/ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. 
Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . 
= ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_k1om.xdw b/x86_64-linux-android/lib/ldscripts/elf_k1om.xdw deleted file mode 100644 index bd1d78f..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_k1om.xdw +++ /dev/null @@ -1,229 +0,0 @@ -/* Script for -pie -z combreloc -z now -z relro: position independent executable, combine & sort relocs */ -OUTPUT_FORMAT("elf64-k1om", "elf64-k1om", - "elf64-k1om") -OUTPUT_ARCH(k1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0); . = 0 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - PROVIDE_HIDDEN (__rel_iplt_start = .); - PROVIDE_HIDDEN (__rel_iplt_end = .); - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely 
.text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } - . 
= DATA_SEGMENT_RELRO_END (0, .); - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_k1om.xn b/x86_64-linux-android/lib/ldscripts/elf_k1om.xn deleted file mode 100644 index 1f64c0a..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_k1om.xn +++ /dev/null @@ -1,229 +0,0 @@ -/* Script for -n: mix text and data on same page */ -OUTPUT_FORMAT("elf64-k1om", "elf64-k1om", - "elf64-k1om") -OUTPUT_ARCH(k1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.iplt : - { - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 
24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_k1om.xr b/x86_64-linux-android/lib/ldscripts/elf_k1om.xr deleted file mode 100644 index fb38fee..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_k1om.xr +++ /dev/null @@ -1,157 +0,0 @@ -/* Script for ld -r: link without relocation */ -OUTPUT_FORMAT("elf64-k1om", "elf64-k1om", - "elf64-k1om") -OUTPUT_ARCH(k1om) - /* For some reason, the Solaris linker makes bad executables - if gld -r is used and the intermediate file has sections starting - at non-zero addresses. Could be a Solaris ld bug, could be a GNU ld - bug. But for now assigning the zero vmas works. 
*/ -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - .interp 0 : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash 0 : { *(.hash) } - .gnu.hash 0 : { *(.gnu.hash) } - .dynsym 0 : { *(.dynsym) } - .dynstr 0 : { *(.dynstr) } - .gnu.version 0 : { *(.gnu.version) } - .gnu.version_d 0: { *(.gnu.version_d) } - .gnu.version_r 0: { *(.gnu.version_r) } - .rela.init 0 : { *(.rela.init) } - .rela.text 0 : { *(.rela.text) } - .rela.fini 0 : { *(.rela.fini) } - .rela.rodata 0 : { *(.rela.rodata) } - .rela.data.rel.ro 0 : { *(.rela.data.rel.ro) } - .rela.data 0 : { *(.rela.data) } - .rela.tdata 0 : { *(.rela.tdata) } - .rela.tbss 0 : { *(.rela.tbss) } - .rela.ctors 0 : { *(.rela.ctors) } - .rela.dtors 0 : { *(.rela.dtors) } - .rela.got 0 : { *(.rela.got) } - .rela.bss 0 : { *(.rela.bss) } - .rela.ldata 0 : { *(.rela.ldata) } - .rela.lbss 0 : { *(.rela.lbss) } - .rela.lrodata 0 : { *(.rela.lrodata) } - .rela.iplt 0 : - { - *(.rela.iplt) - } - .rela.plt 0 : - { - *(.rela.plt) - } - .init 0 : - { - KEEP (*(SORT_NONE(.init))) - } - .plt 0 : { *(.plt) } - .iplt 0 : { *(.iplt) } - .text 0 : - { - *(.text .stub) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini 0 : - { - KEEP (*(SORT_NONE(.fini))) - } - .rodata 0 : { *(.rodata) } - .rodata1 0 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame 0 : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges 0 : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. 
*/ - /* Exception handling */ - .eh_frame 0 : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges 0 : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata 0 : { *(.tdata) } - .tbss 0 : { *(.tbss) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - .preinit_array 0 : - { - KEEP (*(.preinit_array)) - } - .jcr 0 : { KEEP (*(.jcr)) } - .dynamic 0 : { *(.dynamic) } - .got 0 : { *(.got) *(.igot) } - .got.plt 0 : { *(.got.plt) *(.igot.plt) } - .data 0 : - { - *(.data) - } - .data1 0 : { *(.data1) } - .bss 0 : - { - *(.dynbss) - *(.bss) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - } - .lbss 0 : - { - *(.dynlbss) - *(.lbss) - *(LARGE_COMMON) - } - .lrodata 0 : - { - *(.lrodata) - } - .ldata 0 : - { - *(.ldata) - } - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_k1om.xs b/x86_64-linux-android/lib/ldscripts/elf_k1om.xs deleted file mode 100644 index 7acdd5d..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_k1om.xs +++ /dev/null @@ -1,220 +0,0 @@ -/* Script for ld --shared: link shared library */ -OUTPUT_FORMAT("elf64-k1om", "elf64-k1om", - "elf64-k1om") -OUTPUT_ARCH(k1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - . 
= 0 + SIZEOF_HEADERS; - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.iplt : - { - *(.rela.iplt) - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . 
= .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_k1om.xsc b/x86_64-linux-android/lib/ldscripts/elf_k1om.xsc deleted file mode 100644 index 03787d8..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_k1om.xsc +++ /dev/null @@ -1,220 +0,0 @@ -/* Script for --shared -z combreloc: shared library, combine & sort relocs */ -/* Modified for Android. */ -OUTPUT_FORMAT("elf64-k1om", "elf64-k1om", - "elf64-k1om") -OUTPUT_ARCH(k1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - . 
= 0 + SIZEOF_HEADERS; - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - *(.rela.iplt) - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . 
= ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(64 / 8); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. 
- The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. 
*/ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_k1om.xsw b/x86_64-linux-android/lib/ldscripts/elf_k1om.xsw deleted file mode 100644 index 82edf36..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_k1om.xsw +++ /dev/null @@ -1,218 +0,0 @@ -/* Script for --shared -z combreloc -z now -z relro: shared library, combine & sort relocs */ -OUTPUT_FORMAT("elf64-k1om", "elf64-k1om", - "elf64-k1om") -OUTPUT_ARCH(k1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - . = 0 + SIZEOF_HEADERS; - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - *(.rela.iplt) - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (0, .); - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . 
= .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_k1om.xu b/x86_64-linux-android/lib/ldscripts/elf_k1om.xu deleted file mode 100644 index e78e1c7..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_k1om.xu +++ /dev/null @@ -1,158 +0,0 @@ -/* Script for ld -Ur: link w/out relocation, do create constructors */ -OUTPUT_FORMAT("elf64-k1om", "elf64-k1om", - "elf64-k1om") -OUTPUT_ARCH(k1om) - /* For some reason, the Solaris linker makes bad executables - if gld -r is used and the intermediate file has sections starting - at non-zero addresses. Could be a Solaris ld bug, could be a GNU ld - bug. But for now assigning the zero vmas works. 
*/ -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - .interp 0 : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash 0 : { *(.hash) } - .gnu.hash 0 : { *(.gnu.hash) } - .dynsym 0 : { *(.dynsym) } - .dynstr 0 : { *(.dynstr) } - .gnu.version 0 : { *(.gnu.version) } - .gnu.version_d 0: { *(.gnu.version_d) } - .gnu.version_r 0: { *(.gnu.version_r) } - .rela.init 0 : { *(.rela.init) } - .rela.text 0 : { *(.rela.text) } - .rela.fini 0 : { *(.rela.fini) } - .rela.rodata 0 : { *(.rela.rodata) } - .rela.data.rel.ro 0 : { *(.rela.data.rel.ro) } - .rela.data 0 : { *(.rela.data) } - .rela.tdata 0 : { *(.rela.tdata) } - .rela.tbss 0 : { *(.rela.tbss) } - .rela.ctors 0 : { *(.rela.ctors) } - .rela.dtors 0 : { *(.rela.dtors) } - .rela.got 0 : { *(.rela.got) } - .rela.bss 0 : { *(.rela.bss) } - .rela.ldata 0 : { *(.rela.ldata) } - .rela.lbss 0 : { *(.rela.lbss) } - .rela.lrodata 0 : { *(.rela.lrodata) } - .rela.iplt 0 : - { - *(.rela.iplt) - } - .rela.plt 0 : - { - *(.rela.plt) - } - .init 0 : - { - KEEP (*(SORT_NONE(.init))) - } - .plt 0 : { *(.plt) } - .iplt 0 : { *(.iplt) } - .text 0 : - { - *(.text .stub) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini 0 : - { - KEEP (*(SORT_NONE(.fini))) - } - .rodata 0 : { *(.rodata) } - .rodata1 0 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame 0 : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges 0 : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. 
*/ - /* Exception handling */ - .eh_frame 0 : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges 0 : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata 0 : { *(.tdata) } - .tbss 0 : { *(.tbss) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - .preinit_array 0 : - { - KEEP (*(.preinit_array)) - } - .jcr 0 : { KEEP (*(.jcr)) } - .dynamic 0 : { *(.dynamic) } - .got 0 : { *(.got) *(.igot) } - .got.plt 0 : { *(.got.plt) *(.igot.plt) } - .data 0 : - { - *(.data) - SORT(CONSTRUCTORS) - } - .data1 0 : { *(.data1) } - .bss 0 : - { - *(.dynbss) - *(.bss) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - } - .lbss 0 : - { - *(.dynlbss) - *(.lbss) - *(LARGE_COMMON) - } - .lrodata 0 : - { - *(.lrodata) - } - .ldata 0 : - { - *(.ldata) - } - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_k1om.xw b/x86_64-linux-android/lib/ldscripts/elf_k1om.xw deleted file mode 100644 index 2b43914..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_k1om.xw +++ /dev/null @@ -1,229 +0,0 @@ -/* Script for -z combreloc -z now -z relro: combine and sort reloc sections */ -OUTPUT_FORMAT("elf64-k1om", "elf64-k1om", - "elf64-k1om") -OUTPUT_ARCH(k1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - PROVIDE_HIDDEN (__rel_iplt_start = .); - PROVIDE_HIDDEN (__rel_iplt_end = .); - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. 
*/ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. 
Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (0, .); - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . 
= ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_l1om.x b/x86_64-linux-android/lib/ldscripts/elf_l1om.x deleted file mode 100644 index 7794a13..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_l1om.x +++ /dev/null @@ -1,230 +0,0 @@ -/* Default linker script, for normal executables */ -/* Modified for Android. */ -OUTPUT_FORMAT("elf64-l1om", "elf64-l1om", - "elf64-l1om") -OUTPUT_ARCH(l1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . = 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.iplt : - { - PROVIDE_HIDDEN (__rela_iplt_start 
= .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 
24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_l1om.xbn b/x86_64-linux-android/lib/ldscripts/elf_l1om.xbn deleted file mode 100644 index b6befd6..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_l1om.xbn +++ /dev/null @@ -1,227 +0,0 @@ -/* Script for -N: mix text and data on same page; don't align data */ -OUTPUT_FORMAT("elf64-l1om", "elf64-l1om", - "elf64-l1om") -OUTPUT_ARCH(l1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.iplt : - { - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = .; - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. 
- The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. 
- Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_l1om.xc b/x86_64-linux-android/lib/ldscripts/elf_l1om.xc deleted file mode 100644 index 1b05981..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_l1om.xc +++ /dev/null @@ -1,230 +0,0 @@ -/* Script for -z combreloc: combine and sort reloc sections */ -OUTPUT_FORMAT("elf64-l1om", "elf64-l1om", - "elf64-l1om") -OUTPUT_ARCH(l1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - PROVIDE_HIDDEN (__rel_iplt_start = .); - PROVIDE_HIDDEN (__rel_iplt_end = .); - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. 
*/ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. 
Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . 
= ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_l1om.xd b/x86_64-linux-android/lib/ldscripts/elf_l1om.xd deleted file mode 100644 index 89fb0e4..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_l1om.xd +++ /dev/null @@ -1,229 +0,0 @@ -/* Script for ld -pie: link position independent executable */ -OUTPUT_FORMAT("elf64-l1om", "elf64-l1om", - "elf64-l1om") -OUTPUT_ARCH(l1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0); . = 0 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.iplt : - { - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - 
PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 
24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_l1om.xdc b/x86_64-linux-android/lib/ldscripts/elf_l1om.xdc deleted file mode 100644 index b91a287..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_l1om.xdc +++ /dev/null @@ -1,230 +0,0 @@ -/* Script for -pie -z combreloc: position independent executable, combine & sort relocs */ -OUTPUT_FORMAT("elf64-l1om", "elf64-l1om", - "elf64-l1om") -OUTPUT_ARCH(l1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0); . 
= 0 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - PROVIDE_HIDDEN (__rel_iplt_start = .); - PROVIDE_HIDDEN (__rel_iplt_end = .); - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. 
*/ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. 
Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . 
= ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_l1om.xdw b/x86_64-linux-android/lib/ldscripts/elf_l1om.xdw deleted file mode 100644 index 8a8704b..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_l1om.xdw +++ /dev/null @@ -1,229 +0,0 @@ -/* Script for -pie -z combreloc -z now -z relro: position independent executable, combine & sort relocs */ -OUTPUT_FORMAT("elf64-l1om", "elf64-l1om", - "elf64-l1om") -OUTPUT_ARCH(l1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0); . = 0 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - PROVIDE_HIDDEN (__rel_iplt_start = .); - PROVIDE_HIDDEN (__rel_iplt_end = .); - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely 
.text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } - . 
= DATA_SEGMENT_RELRO_END (0, .); - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_l1om.xn b/x86_64-linux-android/lib/ldscripts/elf_l1om.xn deleted file mode 100644 index 4d6db18..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_l1om.xn +++ /dev/null @@ -1,229 +0,0 @@ -/* Script for -n: mix text and data on same page */ -OUTPUT_FORMAT("elf64-l1om", "elf64-l1om", - "elf64-l1om") -OUTPUT_ARCH(l1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.iplt : - { - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 
24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_l1om.xr b/x86_64-linux-android/lib/ldscripts/elf_l1om.xr deleted file mode 100644 index ead9c53..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_l1om.xr +++ /dev/null @@ -1,157 +0,0 @@ -/* Script for ld -r: link without relocation */ -OUTPUT_FORMAT("elf64-l1om", "elf64-l1om", - "elf64-l1om") -OUTPUT_ARCH(l1om) - /* For some reason, the Solaris linker makes bad executables - if gld -r is used and the intermediate file has sections starting - at non-zero addresses. Could be a Solaris ld bug, could be a GNU ld - bug. But for now assigning the zero vmas works. 
*/ -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - .interp 0 : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash 0 : { *(.hash) } - .gnu.hash 0 : { *(.gnu.hash) } - .dynsym 0 : { *(.dynsym) } - .dynstr 0 : { *(.dynstr) } - .gnu.version 0 : { *(.gnu.version) } - .gnu.version_d 0: { *(.gnu.version_d) } - .gnu.version_r 0: { *(.gnu.version_r) } - .rela.init 0 : { *(.rela.init) } - .rela.text 0 : { *(.rela.text) } - .rela.fini 0 : { *(.rela.fini) } - .rela.rodata 0 : { *(.rela.rodata) } - .rela.data.rel.ro 0 : { *(.rela.data.rel.ro) } - .rela.data 0 : { *(.rela.data) } - .rela.tdata 0 : { *(.rela.tdata) } - .rela.tbss 0 : { *(.rela.tbss) } - .rela.ctors 0 : { *(.rela.ctors) } - .rela.dtors 0 : { *(.rela.dtors) } - .rela.got 0 : { *(.rela.got) } - .rela.bss 0 : { *(.rela.bss) } - .rela.ldata 0 : { *(.rela.ldata) } - .rela.lbss 0 : { *(.rela.lbss) } - .rela.lrodata 0 : { *(.rela.lrodata) } - .rela.iplt 0 : - { - *(.rela.iplt) - } - .rela.plt 0 : - { - *(.rela.plt) - } - .init 0 : - { - KEEP (*(SORT_NONE(.init))) - } - .plt 0 : { *(.plt) } - .iplt 0 : { *(.iplt) } - .text 0 : - { - *(.text .stub) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini 0 : - { - KEEP (*(SORT_NONE(.fini))) - } - .rodata 0 : { *(.rodata) } - .rodata1 0 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame 0 : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges 0 : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. 
*/ - /* Exception handling */ - .eh_frame 0 : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges 0 : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata 0 : { *(.tdata) } - .tbss 0 : { *(.tbss) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - .preinit_array 0 : - { - KEEP (*(.preinit_array)) - } - .jcr 0 : { KEEP (*(.jcr)) } - .dynamic 0 : { *(.dynamic) } - .got 0 : { *(.got) *(.igot) } - .got.plt 0 : { *(.got.plt) *(.igot.plt) } - .data 0 : - { - *(.data) - } - .data1 0 : { *(.data1) } - .bss 0 : - { - *(.dynbss) - *(.bss) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - } - .lbss 0 : - { - *(.dynlbss) - *(.lbss) - *(LARGE_COMMON) - } - .lrodata 0 : - { - *(.lrodata) - } - .ldata 0 : - { - *(.ldata) - } - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_l1om.xs b/x86_64-linux-android/lib/ldscripts/elf_l1om.xs deleted file mode 100644 index d9cbe01..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_l1om.xs +++ /dev/null @@ -1,220 +0,0 @@ -/* Script for ld --shared: link shared library */ -OUTPUT_FORMAT("elf64-l1om", "elf64-l1om", - "elf64-l1om") -OUTPUT_ARCH(l1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - . 
= 0 + SIZEOF_HEADERS; - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.iplt : - { - *(.rela.iplt) - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . 
= .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_l1om.xsc b/x86_64-linux-android/lib/ldscripts/elf_l1om.xsc deleted file mode 100644 index e252f9f..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_l1om.xsc +++ /dev/null @@ -1,220 +0,0 @@ -/* Script for --shared -z combreloc: shared library, combine & sort relocs */ -/* Modified for Android. */ -OUTPUT_FORMAT("elf64-l1om", "elf64-l1om", - "elf64-l1om") -OUTPUT_ARCH(l1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - . 
= 0 + SIZEOF_HEADERS; - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - *(.rela.iplt) - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . 
= ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(64 / 8); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. 
- The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. 
*/ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_l1om.xsw b/x86_64-linux-android/lib/ldscripts/elf_l1om.xsw deleted file mode 100644 index f583d68..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_l1om.xsw +++ /dev/null @@ -1,218 +0,0 @@ -/* Script for --shared -z combreloc -z now -z relro: shared library, combine & sort relocs */ -OUTPUT_FORMAT("elf64-l1om", "elf64-l1om", - "elf64-l1om") -OUTPUT_ARCH(l1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - . = 0 + SIZEOF_HEADERS; - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - *(.rela.iplt) - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (0, .); - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . 
= .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_l1om.xu b/x86_64-linux-android/lib/ldscripts/elf_l1om.xu deleted file mode 100644 index c6247bb..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_l1om.xu +++ /dev/null @@ -1,158 +0,0 @@ -/* Script for ld -Ur: link w/out relocation, do create constructors */ -OUTPUT_FORMAT("elf64-l1om", "elf64-l1om", - "elf64-l1om") -OUTPUT_ARCH(l1om) - /* For some reason, the Solaris linker makes bad executables - if gld -r is used and the intermediate file has sections starting - at non-zero addresses. Could be a Solaris ld bug, could be a GNU ld - bug. But for now assigning the zero vmas works. 
*/ -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - .interp 0 : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash 0 : { *(.hash) } - .gnu.hash 0 : { *(.gnu.hash) } - .dynsym 0 : { *(.dynsym) } - .dynstr 0 : { *(.dynstr) } - .gnu.version 0 : { *(.gnu.version) } - .gnu.version_d 0: { *(.gnu.version_d) } - .gnu.version_r 0: { *(.gnu.version_r) } - .rela.init 0 : { *(.rela.init) } - .rela.text 0 : { *(.rela.text) } - .rela.fini 0 : { *(.rela.fini) } - .rela.rodata 0 : { *(.rela.rodata) } - .rela.data.rel.ro 0 : { *(.rela.data.rel.ro) } - .rela.data 0 : { *(.rela.data) } - .rela.tdata 0 : { *(.rela.tdata) } - .rela.tbss 0 : { *(.rela.tbss) } - .rela.ctors 0 : { *(.rela.ctors) } - .rela.dtors 0 : { *(.rela.dtors) } - .rela.got 0 : { *(.rela.got) } - .rela.bss 0 : { *(.rela.bss) } - .rela.ldata 0 : { *(.rela.ldata) } - .rela.lbss 0 : { *(.rela.lbss) } - .rela.lrodata 0 : { *(.rela.lrodata) } - .rela.iplt 0 : - { - *(.rela.iplt) - } - .rela.plt 0 : - { - *(.rela.plt) - } - .init 0 : - { - KEEP (*(SORT_NONE(.init))) - } - .plt 0 : { *(.plt) } - .iplt 0 : { *(.iplt) } - .text 0 : - { - *(.text .stub) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini 0 : - { - KEEP (*(SORT_NONE(.fini))) - } - .rodata 0 : { *(.rodata) } - .rodata1 0 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame 0 : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges 0 : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. 
*/ - /* Exception handling */ - .eh_frame 0 : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges 0 : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata 0 : { *(.tdata) } - .tbss 0 : { *(.tbss) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - .preinit_array 0 : - { - KEEP (*(.preinit_array)) - } - .jcr 0 : { KEEP (*(.jcr)) } - .dynamic 0 : { *(.dynamic) } - .got 0 : { *(.got) *(.igot) } - .got.plt 0 : { *(.got.plt) *(.igot.plt) } - .data 0 : - { - *(.data) - SORT(CONSTRUCTORS) - } - .data1 0 : { *(.data1) } - .bss 0 : - { - *(.dynbss) - *(.bss) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - } - .lbss 0 : - { - *(.dynlbss) - *(.lbss) - *(LARGE_COMMON) - } - .lrodata 0 : - { - *(.lrodata) - } - .ldata 0 : - { - *(.ldata) - } - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_l1om.xw b/x86_64-linux-android/lib/ldscripts/elf_l1om.xw deleted file mode 100644 index 44ebc93..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_l1om.xw +++ /dev/null @@ -1,229 +0,0 @@ -/* Script for -z combreloc -z now -z relro: combine and sort reloc sections */ -OUTPUT_FORMAT("elf64-l1om", "elf64-l1om", - "elf64-l1om") -OUTPUT_ARCH(l1om) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - PROVIDE_HIDDEN (__rel_iplt_start = .); - PROVIDE_HIDDEN (__rel_iplt_end = .); - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .rela.plt : - { - *(.rela.plt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) } - .iplt : { *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. 
*/ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. 
Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (0, .); - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . 
= ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_x86_64.x b/x86_64-linux-android/lib/ldscripts/elf_x86_64.x deleted file mode 100644 index 575d2ab..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_x86_64.x +++ /dev/null @@ -1,227 +0,0 @@ -/* Default linker script, for normal executables */ -/* Modified for Android. */ -OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", - "elf64-x86-64") -OUTPUT_ARCH(i386:x86-64) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . = 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.ifunc : { *(.rela.ifunc) } - 
.rela.plt : - { - *(.rela.plt) - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 
24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xbn b/x86_64-linux-android/lib/ldscripts/elf_x86_64.xbn deleted file mode 100644 index f0f36ad..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xbn +++ /dev/null @@ -1,224 +0,0 @@ -/* Script for -N: mix text and data on same page; don't align data */ -OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", - "elf64-x86-64") -OUTPUT_ARCH(i386:x86-64) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.ifunc : { *(.rela.ifunc) } - .rela.plt : - { - *(.rela.plt) - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = .; - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. 
- The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. 
- Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xc b/x86_64-linux-android/lib/ldscripts/elf_x86_64.xc deleted file mode 100644 index 9ac8216..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xc +++ /dev/null @@ -1,228 +0,0 @@ -/* Script for -z combreloc: combine and sort reloc sections */ -OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", - "elf64-x86-64") -OUTPUT_ARCH(i386:x86-64) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - *(.rela.ifunc) - } - .rela.plt : - { - *(.rela.plt) - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. 
*/ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. 
Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . 
= ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xd b/x86_64-linux-android/lib/ldscripts/elf_x86_64.xd deleted file mode 100644 index 944b71a..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xd +++ /dev/null @@ -1,226 +0,0 @@ -/* Script for ld -pie: link position independent executable */ -OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", - "elf64-x86-64") -OUTPUT_ARCH(i386:x86-64) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0); . = 0 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.ifunc : { *(.rela.ifunc) } - .rela.plt : - { - 
*(.rela.plt) - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 
24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xdc b/x86_64-linux-android/lib/ldscripts/elf_x86_64.xdc deleted file mode 100644 index 4b8571a..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xdc +++ /dev/null @@ -1,228 +0,0 @@ -/* Script for -pie -z combreloc: position independent executable, combine & sort relocs */ -OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", - "elf64-x86-64") -OUTPUT_ARCH(i386:x86-64) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0); . 
= 0 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - *(.rela.ifunc) - } - .rela.plt : - { - *(.rela.plt) - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. 
For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. 
The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . 
= DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xdw b/x86_64-linux-android/lib/ldscripts/elf_x86_64.xdw deleted file mode 100644 index c5e0ce0..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xdw +++ /dev/null @@ -1,227 +0,0 @@ -/* Script for -pie -z combreloc -z now -z relro: position independent executable, combine & sort relocs */ -OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", - "elf64-x86-64") -OUTPUT_ARCH(i386:x86-64) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0); . = 0 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - *(.rela.ifunc) - } - .rela.plt : - { - *(.rela.plt) - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - 
*(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } - . 
= DATA_SEGMENT_RELRO_END (0, .); - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xn b/x86_64-linux-android/lib/ldscripts/elf_x86_64.xn deleted file mode 100644 index 6fde39b..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xn +++ /dev/null @@ -1,226 +0,0 @@ -/* Script for -n: mix text and data on same page */ -OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", - "elf64-x86-64") -OUTPUT_ARCH(i386:x86-64) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.ifunc : { *(.rela.ifunc) } - .rela.plt : - { - *(.rela.plt) - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 
24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xr b/x86_64-linux-android/lib/ldscripts/elf_x86_64.xr deleted file mode 100644 index b7f1e1c..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xr +++ /dev/null @@ -1,153 +0,0 @@ -/* Script for ld -r: link without relocation */ -OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", - "elf64-x86-64") -OUTPUT_ARCH(i386:x86-64) - /* For some reason, the Solaris linker makes bad executables - if gld -r is used and the intermediate file has sections starting - at non-zero addresses. Could be a Solaris ld bug, could be a GNU ld - bug. But for now assigning the zero vmas works. 
*/ -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - .interp 0 : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash 0 : { *(.hash) } - .gnu.hash 0 : { *(.gnu.hash) } - .dynsym 0 : { *(.dynsym) } - .dynstr 0 : { *(.dynstr) } - .gnu.version 0 : { *(.gnu.version) } - .gnu.version_d 0: { *(.gnu.version_d) } - .gnu.version_r 0: { *(.gnu.version_r) } - .rela.init 0 : { *(.rela.init) } - .rela.text 0 : { *(.rela.text) } - .rela.fini 0 : { *(.rela.fini) } - .rela.rodata 0 : { *(.rela.rodata) } - .rela.data.rel.ro 0 : { *(.rela.data.rel.ro) } - .rela.data 0 : { *(.rela.data) } - .rela.tdata 0 : { *(.rela.tdata) } - .rela.tbss 0 : { *(.rela.tbss) } - .rela.ctors 0 : { *(.rela.ctors) } - .rela.dtors 0 : { *(.rela.dtors) } - .rela.got 0 : { *(.rela.got) } - .rela.bss 0 : { *(.rela.bss) } - .rela.ldata 0 : { *(.rela.ldata) } - .rela.lbss 0 : { *(.rela.lbss) } - .rela.lrodata 0 : { *(.rela.lrodata) } - .rela.ifunc 0 : { *(.rela.ifunc) } - .rela.plt 0 : - { - *(.rela.plt) - } - .init 0 : - { - KEEP (*(SORT_NONE(.init))) - } - .plt 0 : { *(.plt) *(.iplt) } - .text 0 : - { - *(.text .stub) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini 0 : - { - KEEP (*(SORT_NONE(.fini))) - } - .rodata 0 : { *(.rodata) } - .rodata1 0 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame 0 : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges 0 : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. 
*/ - /* Exception handling */ - .eh_frame 0 : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges 0 : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata 0 : { *(.tdata) } - .tbss 0 : { *(.tbss) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - .preinit_array 0 : - { - KEEP (*(.preinit_array)) - } - .jcr 0 : { KEEP (*(.jcr)) } - .dynamic 0 : { *(.dynamic) } - .got 0 : { *(.got) *(.igot) } - .got.plt 0 : { *(.got.plt) *(.igot.plt) } - .data 0 : - { - *(.data) - } - .data1 0 : { *(.data1) } - .bss 0 : - { - *(.dynbss) - *(.bss) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - } - .lbss 0 : - { - *(.dynlbss) - *(.lbss) - *(LARGE_COMMON) - } - .lrodata 0 : - { - *(.lrodata) - } - .ldata 0 : - { - *(.ldata) - } - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xs b/x86_64-linux-android/lib/ldscripts/elf_x86_64.xs deleted file mode 100644 index 2d2b8ff..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xs +++ /dev/null @@ -1,217 +0,0 @@ -/* Script for ld --shared: link shared library */ -OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", - "elf64-x86-64") -OUTPUT_ARCH(i386:x86-64) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - . 
= 0 + SIZEOF_HEADERS; - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.init : { *(.rela.init) } - .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) } - .rela.fini : { *(.rela.fini) } - .rela.rodata : { *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) } - .rela.data.rel.ro : { *(.rela.data.rel.ro .rela.data.rel.ro.* .rela.gnu.linkonce.d.rel.ro.*) } - .rela.data : { *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) } - .rela.tdata : { *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) } - .rela.tbss : { *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) } - .rela.ctors : { *(.rela.ctors) } - .rela.dtors : { *(.rela.dtors) } - .rela.got : { *(.rela.got) } - .rela.bss : { *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) } - .rela.ldata : { *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) } - .rela.lbss : { *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) } - .rela.lrodata : { *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) } - .rela.ifunc : { *(.rela.ifunc) } - .rela.plt : - { - *(.rela.plt) - *(.rela.iplt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . 
= .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xsc b/x86_64-linux-android/lib/ldscripts/elf_x86_64.xsc deleted file mode 100644 index 5c41a31..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xsc +++ /dev/null @@ -1,220 +0,0 @@ -/* Script for --shared -z combreloc: shared library, combine & sort relocs */ -/* Modified for Android. */ -OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", - "elf64-x86-64") -OUTPUT_ARCH(i386:x86-64) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - . 
= 0 + SIZEOF_HEADERS; - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - *(.rela.ifunc) - } - .rela.plt : - { - *(.rela.plt) - *(.rela.iplt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . 
= ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(64 / 8); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. 
- The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. 
*/ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xsw b/x86_64-linux-android/lib/ldscripts/elf_x86_64.xsw deleted file mode 100644 index 3f77038..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xsw +++ /dev/null @@ -1,218 +0,0 @@ -/* Script for --shared -z combreloc -z now -z relro: shared library, combine & sort relocs */ -OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", - "elf64-x86-64") -OUTPUT_ARCH(i386:x86-64) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - . = 0 + SIZEOF_HEADERS; - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - *(.rela.ifunc) - } - .rela.plt : - { - *(.rela.plt) - *(.rela.iplt) - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. 
*/ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . 
= ALIGN(64 / 8); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (0, .); - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . 
= .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . = ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xu b/x86_64-linux-android/lib/ldscripts/elf_x86_64.xu deleted file mode 100644 index 07a5422..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xu +++ /dev/null @@ -1,154 +0,0 @@ -/* Script for ld -Ur: link w/out relocation, do create constructors */ -OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", - "elf64-x86-64") -OUTPUT_ARCH(i386:x86-64) - /* For some reason, the Solaris linker makes bad executables - if gld -r is used and the intermediate file has sections starting - at non-zero addresses. Could be a Solaris ld bug, could be a GNU ld - bug. But for now assigning the zero vmas works. 
*/ -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - .interp 0 : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash 0 : { *(.hash) } - .gnu.hash 0 : { *(.gnu.hash) } - .dynsym 0 : { *(.dynsym) } - .dynstr 0 : { *(.dynstr) } - .gnu.version 0 : { *(.gnu.version) } - .gnu.version_d 0: { *(.gnu.version_d) } - .gnu.version_r 0: { *(.gnu.version_r) } - .rela.init 0 : { *(.rela.init) } - .rela.text 0 : { *(.rela.text) } - .rela.fini 0 : { *(.rela.fini) } - .rela.rodata 0 : { *(.rela.rodata) } - .rela.data.rel.ro 0 : { *(.rela.data.rel.ro) } - .rela.data 0 : { *(.rela.data) } - .rela.tdata 0 : { *(.rela.tdata) } - .rela.tbss 0 : { *(.rela.tbss) } - .rela.ctors 0 : { *(.rela.ctors) } - .rela.dtors 0 : { *(.rela.dtors) } - .rela.got 0 : { *(.rela.got) } - .rela.bss 0 : { *(.rela.bss) } - .rela.ldata 0 : { *(.rela.ldata) } - .rela.lbss 0 : { *(.rela.lbss) } - .rela.lrodata 0 : { *(.rela.lrodata) } - .rela.ifunc 0 : { *(.rela.ifunc) } - .rela.plt 0 : - { - *(.rela.plt) - } - .init 0 : - { - KEEP (*(SORT_NONE(.init))) - } - .plt 0 : { *(.plt) *(.iplt) } - .text 0 : - { - *(.text .stub) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini 0 : - { - KEEP (*(SORT_NONE(.fini))) - } - .rodata 0 : { *(.rodata) } - .rodata1 0 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame 0 : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. */ - .exception_ranges 0 : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. 
*/ - /* Exception handling */ - .eh_frame 0 : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table 0 : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges 0 : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata 0 : { *(.tdata) } - .tbss 0 : { *(.tbss) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - .preinit_array 0 : - { - KEEP (*(.preinit_array)) - } - .jcr 0 : { KEEP (*(.jcr)) } - .dynamic 0 : { *(.dynamic) } - .got 0 : { *(.got) *(.igot) } - .got.plt 0 : { *(.got.plt) *(.igot.plt) } - .data 0 : - { - *(.data) - SORT(CONSTRUCTORS) - } - .data1 0 : { *(.data1) } - .bss 0 : - { - *(.dynbss) - *(.bss) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - } - .lbss 0 : - { - *(.dynlbss) - *(.lbss) - *(LARGE_COMMON) - } - .lrodata 0 : - { - *(.lrodata) - } - .ldata 0 : - { - *(.ldata) - } - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. 
*/ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. */ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } -} diff --git a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xw b/x86_64-linux-android/lib/ldscripts/elf_x86_64.xw deleted file mode 100644 index e131e33..0000000 --- a/x86_64-linux-android/lib/ldscripts/elf_x86_64.xw +++ /dev/null @@ -1,227 +0,0 @@ -/* Script for -z combreloc -z now -z relro: combine and sort reloc sections */ -OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", - "elf64-x86-64") -OUTPUT_ARCH(i386:x86-64) -ENTRY(_start) -SECTIONS -{ - /* Read-only sections, merged into text segment: */ - PROVIDE (__executable_start = 0x400000); . 
= 0x400000 + SIZEOF_HEADERS; - .interp : { *(.interp) } - .note.gnu.build-id : { *(.note.gnu.build-id) } - .hash : { *(.hash) } - .gnu.hash : { *(.gnu.hash) } - .dynsym : { *(.dynsym) } - .dynstr : { *(.dynstr) } - .gnu.version : { *(.gnu.version) } - .gnu.version_d : { *(.gnu.version_d) } - .gnu.version_r : { *(.gnu.version_r) } - .rela.dyn : - { - *(.rela.init) - *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) - *(.rela.fini) - *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) - *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) - *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*) - *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*) - *(.rela.ctors) - *(.rela.dtors) - *(.rela.got) - *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) - *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*) - *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*) - *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*) - *(.rela.ifunc) - } - .rela.plt : - { - *(.rela.plt) - PROVIDE_HIDDEN (__rela_iplt_start = .); - *(.rela.iplt) - PROVIDE_HIDDEN (__rela_iplt_end = .); - } - .init : - { - KEEP (*(SORT_NONE(.init))) - } - .plt : { *(.plt) *(.iplt) } - .text : - { - *(.text.unlikely .text.*_unlikely .text.unlikely.*) - *(.text.exit .text.exit.*) - *(.text.startup .text.startup.*) - *(.text.hot .text.hot.*) - *(.text .stub .text.* .gnu.linkonce.t.*) - /* .gnu.warning sections are handled specially by elf32.em. */ - *(.gnu.warning) - } - .fini : - { - KEEP (*(SORT_NONE(.fini))) - } - PROVIDE (__etext = .); - PROVIDE (_etext = .); - PROVIDE (etext = .); - .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) } - .rodata1 : { *(.rodata1) } - .eh_frame_hdr : { *(.eh_frame_hdr) } - .eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RO { *(.gcc_except_table - .gcc_except_table.*) } - /* These sections are generated by the Sun/Oracle C++ compiler. 
*/ - .exception_ranges : ONLY_IF_RO { *(.exception_ranges - .exception_ranges*) } - /* Adjust the address for the data segment. For 32 bits we want to align - at exactly a page boundary to make life easier for apriori. */ - . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); - /* Exception handling */ - .eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) } - .gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) } - .exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) } - /* Thread Local Storage sections */ - .tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) } - .tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) } - /* Ensure the __preinit_array_start label is properly aligned. We - could instead move the label definition inside the section, but - the linker would then create the section even if it turns out to - be empty, which isn't pretty. */ - . = ALIGN(64 / 8); - PROVIDE_HIDDEN (__preinit_array_start = .); - .preinit_array : - { - KEEP (*(.preinit_array)) - } - PROVIDE_HIDDEN (__preinit_array_end = .); - PROVIDE_HIDDEN (__init_array_start = .); - .init_array : - { - KEEP (*crtbegin*.o(.init_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*))) - KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .ctors)) - } - PROVIDE_HIDDEN (__init_array_end = .); - PROVIDE_HIDDEN (__fini_array_start = .); - .fini_array : - { - KEEP (*crtbegin*.o(.fini_array)) - KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*))) - KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin*.o *crtend.o *crtend*.o ) .dtors)) - } - PROVIDE_HIDDEN (__fini_array_end = .); - .ctors : - { - /* gcc uses crtbegin.o to find the start of - the constructors, so we make sure it is - first. 
Because this is a wildcard, it - doesn't matter if the user does not - actually link against crtbegin.o; the - linker won't look for a file to match a - wildcard. The wildcard also means that it - doesn't matter which directory crtbegin.o - is in. */ - KEEP (*crtbegin.o(.ctors)) - KEEP (*crtbegin*.o(.ctors)) - /* We don't want to include the .ctor section from - the crtend.o file until after the sorted ctors. - The .ctor section from the crtend file contains the - end of ctors marker and it must be last */ - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .ctors)) - KEEP (*(SORT(.ctors.*))) - KEEP (*(.ctors)) - } - .dtors : - { - KEEP (*crtbegin.o(.dtors)) - KEEP (*crtbegin*.o(.dtors)) - KEEP (*(EXCLUDE_FILE (*crtend.o *crtend*.o ) .dtors)) - KEEP (*(SORT(.dtors.*))) - KEEP (*(.dtors)) - } - .jcr : { KEEP (*(.jcr)) } - .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) } - .dynamic : { *(.dynamic) } - .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (0, .); - .data : - { - *(.data .data.* .gnu.linkonce.d.*) - SORT(CONSTRUCTORS) - } - .data1 : { *(.data1) } - _edata = .; PROVIDE (edata = .); - . = .; - __bss_start = .; - .bss : - { - *(.dynbss) - *(.bss .bss.* .gnu.linkonce.b.*) - *(COMMON) - /* Align here to ensure that the .bss section occupies space up to - _end. Align after .bss to ensure correct alignment even if the - .bss section disappears because there are no input sections. */ - . = ALIGN(64 / 8); - } - .lbss : - { - *(.dynlbss) - *(.lbss .lbss.* .gnu.linkonce.lb.*) - *(LARGE_COMMON) - } - . = ALIGN(64 / 8); - . = SEGMENT_START("ldata-segment", .); - .lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.lrodata .lrodata.* .gnu.linkonce.lr.*) - } - .ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) : - { - *(.ldata .ldata.* .gnu.linkonce.l.*) - . = ALIGN(. != 0 ? 64 / 8 : 1); - } - . 
= ALIGN(64 / 8); - _end = .; - _bss_end__ = . ; __bss_end__ = . ; __end__ = . ; - PROVIDE (end = .); - . = DATA_SEGMENT_END (.); - /* Stabs debugging sections. */ - .stab 0 : { *(.stab) } - .stabstr 0 : { *(.stabstr) } - .stab.excl 0 : { *(.stab.excl) } - .stab.exclstr 0 : { *(.stab.exclstr) } - .stab.index 0 : { *(.stab.index) } - .stab.indexstr 0 : { *(.stab.indexstr) } - .comment 0 : { *(.comment) } - /* DWARF debug sections. - Symbols in the DWARF debugging sections are relative to the beginning - of the section so we begin them at 0. */ - /* DWARF 1 */ - .debug 0 : { *(.debug) } - .line 0 : { *(.line) } - /* GNU DWARF 1 extensions */ - .debug_srcinfo 0 : { *(.debug_srcinfo) } - .debug_sfnames 0 : { *(.debug_sfnames) } - /* DWARF 1.1 and DWARF 2 */ - .debug_aranges 0 : { *(.debug_aranges) } - .debug_pubnames 0 : { *(.debug_pubnames) } - /* DWARF 2 */ - .debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) } - .debug_abbrev 0 : { *(.debug_abbrev) } - .debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) } - .debug_frame 0 : { *(.debug_frame) } - .debug_str 0 : { *(.debug_str) } - .debug_loc 0 : { *(.debug_loc) } - .debug_macinfo 0 : { *(.debug_macinfo) } - /* SGI/MIPS DWARF 2 extensions */ - .debug_weaknames 0 : { *(.debug_weaknames) } - .debug_funcnames 0 : { *(.debug_funcnames) } - .debug_typenames 0 : { *(.debug_typenames) } - .debug_varnames 0 : { *(.debug_varnames) } - /* DWARF 3 */ - .debug_pubtypes 0 : { *(.debug_pubtypes) } - .debug_ranges 0 : { *(.debug_ranges) } - /* DWARF Extension. 
*/ - .debug_macro 0 : { *(.debug_macro) } - .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } - /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) *(.mdebug.*) } -} diff --git a/x86_64-linux-android/lib/ldscripts/i386linux.x b/x86_64-linux-android/lib/ldscripts/i386linux.x deleted file mode 100644 index 68bc5a0..0000000 --- a/x86_64-linux-android/lib/ldscripts/i386linux.x +++ /dev/null @@ -1,47 +0,0 @@ -/* Default linker script, for normal executables */ -/* Modified for Android. */ -OUTPUT_FORMAT("a.out-i386-linux", "a.out-i386-linux", - "a.out-i386-linux") -OUTPUT_ARCH(i386) -PROVIDE (__stack = 0); -SECTIONS -{ - . = 0x1020; - .text : - { - CREATE_OBJECT_SYMBOLS - *(.text) - /* The next six sections are for SunOS dynamic linking. The order - is important. */ - *(.dynrel) - *(.hash) - *(.dynsym) - *(.dynstr) - *(.rules) - *(.need) - _etext = .; - __etext = .; - } - . = ALIGN(0x1000); - .data : - { - /* The first three sections are for SunOS dynamic linking. */ - *(.dynamic) - *(.got) - *(.plt) - *(.data) - *(.linux-dynamic) /* For Linux dynamic linking. */ - CONSTRUCTORS - _edata = .; - __edata = .; - } - .bss : - { - __bss_start = .; - *(.bss) - *(COMMON) - . = ALIGN(4); - _end = . ; - __end = . ; - } -} diff --git a/x86_64-linux-android/lib/ldscripts/i386linux.xbn b/x86_64-linux-android/lib/ldscripts/i386linux.xbn deleted file mode 100644 index 91b1e91..0000000 --- a/x86_64-linux-android/lib/ldscripts/i386linux.xbn +++ /dev/null @@ -1,46 +0,0 @@ -/* Script for -N: mix text and data on same page; don't align data */ -OUTPUT_FORMAT("a.out-i386-linux", "a.out-i386-linux", - "a.out-i386-linux") -OUTPUT_ARCH(i386) -PROVIDE (__stack = 0); -SECTIONS -{ - . = 0; - .text : - { - CREATE_OBJECT_SYMBOLS - *(.text) - /* The next six sections are for SunOS dynamic linking. The order - is important. */ - *(.dynrel) - *(.hash) - *(.dynsym) - *(.dynstr) - *(.rules) - *(.need) - _etext = .; - __etext = .; - } - . 
= .; - .data : - { - /* The first three sections are for SunOS dynamic linking. */ - *(.dynamic) - *(.got) - *(.plt) - *(.data) - *(.linux-dynamic) /* For Linux dynamic linking. */ - CONSTRUCTORS - _edata = .; - __edata = .; - } - .bss : - { - __bss_start = .; - *(.bss) - *(COMMON) - . = ALIGN(4); - _end = . ; - __end = . ; - } -} diff --git a/x86_64-linux-android/lib/ldscripts/i386linux.xn b/x86_64-linux-android/lib/ldscripts/i386linux.xn deleted file mode 100644 index 6185656..0000000 --- a/x86_64-linux-android/lib/ldscripts/i386linux.xn +++ /dev/null @@ -1,46 +0,0 @@ -/* Script for -n: mix text and data on same page */ -OUTPUT_FORMAT("a.out-i386-linux", "a.out-i386-linux", - "a.out-i386-linux") -OUTPUT_ARCH(i386) -PROVIDE (__stack = 0); -SECTIONS -{ - . = 0; - .text : - { - CREATE_OBJECT_SYMBOLS - *(.text) - /* The next six sections are for SunOS dynamic linking. The order - is important. */ - *(.dynrel) - *(.hash) - *(.dynsym) - *(.dynstr) - *(.rules) - *(.need) - _etext = .; - __etext = .; - } - . = ALIGN(0x1000); - .data : - { - /* The first three sections are for SunOS dynamic linking. */ - *(.dynamic) - *(.got) - *(.plt) - *(.data) - *(.linux-dynamic) /* For Linux dynamic linking. */ - CONSTRUCTORS - _edata = .; - __edata = .; - } - .bss : - { - __bss_start = .; - *(.bss) - *(COMMON) - . = ALIGN(4); - _end = . ; - __end = . ; - } -} diff --git a/x86_64-linux-android/lib/ldscripts/i386linux.xr b/x86_64-linux-android/lib/ldscripts/i386linux.xr deleted file mode 100644 index 8a33f28..0000000 --- a/x86_64-linux-android/lib/ldscripts/i386linux.xr +++ /dev/null @@ -1,37 +0,0 @@ -/* Script for ld -r: link without relocation */ -OUTPUT_FORMAT("a.out-i386-linux", "a.out-i386-linux", - "a.out-i386-linux") -OUTPUT_ARCH(i386) -SECTIONS -{ - .text : - { - CREATE_OBJECT_SYMBOLS - *(.text) - /* The next six sections are for SunOS dynamic linking. The order - is important. 
*/ - *(.dynrel) - *(.hash) - *(.dynsym) - *(.dynstr) - *(.rules) - *(.need) - } - .data : - { - /* The first three sections are for SunOS dynamic linking. */ - *(.dynamic) - *(.got) - *(.plt) - *(.data) - *(.linux-dynamic) /* For Linux dynamic linking. */ - } - .bss : - { - ; - *(.bss) - *(COMMON) - ; - ; - } -} diff --git a/x86_64-linux-android/lib/ldscripts/i386linux.xu b/x86_64-linux-android/lib/ldscripts/i386linux.xu deleted file mode 100644 index 6847100..0000000 --- a/x86_64-linux-android/lib/ldscripts/i386linux.xu +++ /dev/null @@ -1,38 +0,0 @@ -/* Script for ld -Ur: link w/out relocation, do create constructors */ -OUTPUT_FORMAT("a.out-i386-linux", "a.out-i386-linux", - "a.out-i386-linux") -OUTPUT_ARCH(i386) -SECTIONS -{ - .text : - { - CREATE_OBJECT_SYMBOLS - *(.text) - /* The next six sections are for SunOS dynamic linking. The order - is important. */ - *(.dynrel) - *(.hash) - *(.dynsym) - *(.dynstr) - *(.rules) - *(.need) - } - .data : - { - /* The first three sections are for SunOS dynamic linking. */ - *(.dynamic) - *(.got) - *(.plt) - *(.data) - *(.linux-dynamic) /* For Linux dynamic linking. */ - CONSTRUCTORS - } - .bss : - { - ; - *(.bss) - *(COMMON) - ; - ; - } -} diff --git a/x86_64-linux-android/lib/libatomic.a b/x86_64-linux-android/lib/libatomic.a Binary files differdeleted file mode 100644 index 0167b50..0000000 --- a/x86_64-linux-android/lib/libatomic.a +++ /dev/null diff --git a/x86_64-linux-android/lib/libgomp.a b/x86_64-linux-android/lib/libgomp.a Binary files differdeleted file mode 100644 index d0a1e79..0000000 --- a/x86_64-linux-android/lib/libgomp.a +++ /dev/null diff --git a/x86_64-linux-android/lib/libgomp.spec b/x86_64-linux-android/lib/libgomp.spec deleted file mode 100644 index ec773a8..0000000 --- a/x86_64-linux-android/lib/libgomp.spec +++ /dev/null @@ -1,3 +0,0 @@ -# This spec file is read by gcc when linking. It is used to specify the -# standard libraries we need in order to link with -fopenmp. 
-*link_gomp: -lgomp diff --git a/x86_64-linux-android/lib64/libatomic.a b/x86_64-linux-android/lib64/libatomic.a Binary files differdeleted file mode 100644 index 58a0a71..0000000 --- a/x86_64-linux-android/lib64/libatomic.a +++ /dev/null diff --git a/x86_64-linux-android/lib64/libgomp.a b/x86_64-linux-android/lib64/libgomp.a Binary files differdeleted file mode 100644 index e36c111..0000000 --- a/x86_64-linux-android/lib64/libgomp.a +++ /dev/null diff --git a/x86_64-linux-android/lib64/libgomp.spec b/x86_64-linux-android/lib64/libgomp.spec deleted file mode 100644 index ec773a8..0000000 --- a/x86_64-linux-android/lib64/libgomp.spec +++ /dev/null @@ -1,3 +0,0 @@ -# This spec file is read by gcc when linking. It is used to specify the -# standard libraries we need in order to link with -fopenmp. -*link_gomp: -lgomp diff --git a/x86_64-linux-android/libx32/libatomic.a b/x86_64-linux-android/libx32/libatomic.a Binary files differdeleted file mode 100644 index a510c74..0000000 --- a/x86_64-linux-android/libx32/libatomic.a +++ /dev/null diff --git a/x86_64-linux-android/libx32/libgomp.a b/x86_64-linux-android/libx32/libgomp.a Binary files differdeleted file mode 100644 index 6303d2a..0000000 --- a/x86_64-linux-android/libx32/libgomp.a +++ /dev/null diff --git a/x86_64-linux-android/libx32/libgomp.spec b/x86_64-linux-android/libx32/libgomp.spec deleted file mode 100644 index ec773a8..0000000 --- a/x86_64-linux-android/libx32/libgomp.spec +++ /dev/null @@ -1,3 +0,0 @@ -# This spec file is read by gcc when linking. It is used to specify the -# standard libraries we need in order to link with -fopenmp. -*link_gomp: -lgomp |