Intro:
In this second stage of the SPO600 project, we move from simple GCC analysis to developing a more advanced and practical optimization pass. As the instruction says Clone-Pruning Analysis Pass is designed to identify and analyze cloned functions produced by function multi-versioning and determine whether these clones are genuinely different or essentially redundant. This will include analyzing GIMPLE IR(intermediate representation) to compare structures of these functions.
Our goals in this project is:
- Locating function clones based on naming conventions like func.variant and func.resolver
- Compares their internal GIMPLE representations
- Determines whether they are substantially the same
- Outputs a clear decision - PRUNE or NOPRUNE - indicating whethere a clone can be safely removed
So let's dive in to extend my GIMPLE pass first. While my Stage 1 pass was able to analyze individual functions and count basic blocks and GIMPLE statements, it still lacked global awaerness of the entre program.
With start, for now i am focusing to enhance the pass to:
- Scan and list all functions in the compiled file
- Output results not just to the terminal, but also to GCC's dump file system
- Confirm that the pass tuns as expected and generates .sj dump files
original tree-sj.cc in Stage 1:
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "tree-pass.h"
#include "pass_manager.h"
#include "context.h"
#include "diagnostic-core.h"
#include "tree.h"
#include "tree-core.h"
#include "basic-block.h"
#include "gimple.h"
#include "gimple-iterator.h"
namespace {
// Define pass metadata
const pass_data tree_sj_pass_data = {
GIMPLE_PASS, // Pass type
"tree-sj", // Pass name
OPTGROUP_NONE, // No optimization group
TV_TREE_OPS, // Timevar
0, 0, 0, 0 // Default settings
};
// Custom GIMPLE pass class
class tree_sj_pass : public gimple_opt_pass {
public:
// Constructor
tree_sj_pass(gcc::context *ctxt) : gimple_opt_pass(tree_sj_pass_data, ctxt) {}
// Gate function - always true
bool gate(function *fun) override {
return fun != nullptr;
}
// Main execution function
unsigned int execute(function *fun) override {
fprintf(stderr, "[tree-sj] Processing function: %s\n", function_name(fun));
int basic_block_count = 0;
int gimple_stmt_count = 0;
basic_block bb;
// Iterate over basic blocks
FOR_EACH_BB_FN(bb, fun) {
if (!bb) continue; // Safety check
basic_block_count++;
// Iterate over GIMPLE statements
for (gimple_stmt_iterator gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {
gimple_stmt_count++;
}
}
// Print results
fprintf(stderr, "[tree-sj] Basic Blocks: %d | GIMPLE Statements: %d\n",
basic_block_count, gimple_stmt_count);
return 0;
}
};
} // namespace
// Factory function to create an instance of the pass
gimple_opt_pass *make_pass_sj(gcc::context *ctxt) {
return new tree_sj_pass(ctxt);
}
Here is what my Stage 1 version looked like. Ir analyzed kone function at a time and printed output to stderr:
But with this it did not give me access to the entire set of functions and didnt generate a .sj dump file.
Currently updated tree-sj.cc file
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "tree-pass.h"
#include "pass_manager.h"
#include "context.h"
#include "diagnostic-core.h"
#include "tree.h"
#include "tree-core.h"
#include "basic-block.h"
#include "gimple.h"
#include "gimple-iterator.h"
#include "cgraph.h" // ADDED for global function access
namespace {
const pass_data tree_sj_pass_data = {
GIMPLE_PASS,
"sj", // dump suffix: .sj
OPTGROUP_ALL, // enable dumping with -fdump-tree-sj
TV_NONE,
PROP_cfg,
0, 0, 0, 0
};
class tree_sj_pass : public gimple_opt_pass {
public:
tree_sj_pass(gcc::context *ctxt) : gimple_opt_pass(tree_sj_pass_data, ctxt) {}
bool gate(function *fun) override {
return fun != nullptr;
}
unsigned int execute(function *fun) override {
const char* fname = function_name(fun);
printf("[tree-sj] Processing: %s\n", fname);
int bb_count = 0;
int stmt_count = 0;
basic_block bb;
FOR_EACH_BB_FN(bb, fun) {
bb_count++;
for (gimple_stmt_iterator gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {
stmt_count++;
}
}
printf("Basic blocks: %d | GIMPLE statements: %d\n", bb_count, stmt_count);
if (dump_file) {
fprintf(dump_file, "Function: %s\n", fname);
fprintf(dump_file, "Basic blocks: %d\n", bb_count);
fprintf(dump_file, "GIMPLE statements: %d\n\n", stmt_count);
// ADDED : Scan all known functions
fprintf(dump_file, "=== Scanning all functions ===\n");
struct cgraph_node* node;
FOR_EACH_FUNCTION(node) {
const char* fn_name = IDENTIFIER_POINTER(DECL_NAME(node->decl));
fprintf(dump_file, "Function found: %s\n", fn_name);
}
fprintf(dump_file, "=== End of function scan ===\n\n");
}
return 0;
}
};
} // namespace
gimple_opt_pass* make_pass_sj(gcc::context* ctxt) {
return new tree_sj_pass(ctxt);
}
In this newly updated we have changed followings:
1. including cgraph.h
#include "cgraph.h" // ADDED for global function access
struct cgraph_node* node;
FOR_EACH_FUNCTION(node) {
const char* fn_name = IDENTIFIER_POINTER(DECL_NAME(node->decl));
fprintf(dump_file, "Function found: %s\n", fn_name);
}
This macro iterates over all functions in the compiler's internal call graph, not just the currently active function.
To perform clone detection, we need to see all versions of a functions; original, variants and resolver; for comparisons. And this is not possible with execute(function *fun) alone.
3. Using dump_file instead of just stderr
if (dump_file) {
const pass_data tree_sj_pass_data = {
GIMPLE_PASS,
"sj", // dump suffix: .sj
OPTGROUP_ALL, // enable dumping with -fdump-tree-sj
This tells GCC "This pass can be activated wiith a -fdump-tree-sj flag"
By default passes dont dump output unless they are in a valid "optgroup" so this lets automatically create .sj files for testing and integrate the output into GCC's standard dump mechanism
Now everything seems updated, lets recompile
time make -j20 |& tee rebuild.log
Now lets move to the home directory and test a file
cd ~
nano testing.c
int test(int a) {
return a * 5;
}
int main() {
int x = test(10);
return x;
}
ctrl+x , Y
Now we move to GCC build directory and run with newly updated pass.
cd ~/gcc-build-001/gcc
./xgcc -B. ~/testing.c -O2 -fdump-tree-sj
cat /home/slee569/gcc-build-001/gcc/a-testing.c.264t.sj
So next is identifying and group clone candidate functions which are produced by FMV(Function multi-versioning) and with suffixes as we read from instruction
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "tree-pass.h"
#include "pass_manager.h"
#include "context.h"
#include "diagnostic-core.h"
#undef optimize // optional
#include "tree.h"
#include "tree-core.h"
#include "basic-block.h"
#include "gimple.h"
#include "gimple-iterator.h"
#include "cgraph.h"
#include <map>
#include <string>
#include <vector>
#include <regex>
namespace {
const pass_data tree_sj_pass_data = {
GIMPLE_PASS,
"sj", // This enables: -fdump-tree-sj
OPTGROUP_ALL, // Required for dumping
TV_NONE,
PROP_cfg,
0, 0, 0, 0
};
class tree_sj_pass : public gimple_opt_pass {
public:
tree_sj_pass(gcc::context *ctxt) : gimple_opt_pass(tree_sj_pass_data, ctxt) {}
bool gate(function *fun) override {
return fun != nullptr;
}
unsigned int execute(function *fun) override {
const char* fname = function_name(fun);
printf("[tree-sj] Processing: %s\n", fname);
int bb_count = 0, stmt_count = 0;
basic_block bb;
FOR_EACH_BB_FN(bb, fun) {
bb_count++;
for (gimple_stmt_iterator gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {
stmt_count++;
}
}
printf("Basic blocks: %d | GIMPLE statements: %d\n", bb_count, stmt_count);
if (dump_file) {
fprintf(dump_file, "Function: %s\n", fname);
fprintf(dump_file, "Basic blocks: %d\n", bb_count);
fprintf(dump_file, "GIMPLE statements: %d\n\n", stmt_count);
// ---- Clone grouping logic ----
std::map<std::string, std::vector<std::string>> clone_groups;
std::regex fmv_regex(R"(^(.*)\.(default|resolver|variant\.\d+)$)");
std::smatch match;
struct cgraph_node* node;
FOR_EACH_FUNCTION(node) {
const char* full_name = IDENTIFIER_POINTER(DECL_NAME(node->decl));
std::string name(full_name);
if (std::regex_match(name, match, fmv_regex)) {
std::string base = match[1];
clone_groups[base].push_back(name);
}
}
if (!clone_groups.empty()) {
fprintf(dump_file, "=== Clone Groups ===\n");
for (const auto& group : clone_groups) {
fprintf(dump_file, "Clone group: %s\n", group.first.c_str());
for (const auto& fn : group.second) {
fprintf(dump_file, " -> %s\n", fn.c_str());
}
fprintf(dump_file, "\n");
}
fprintf(dump_file, "=== End Clone Groups ===\n\n");
}
}
return 0;
}
};
} // namespace
gimple_opt_pass* make_pass_sj(gcc::context* ctxt) {
return new tree_sj_pass(ctxt);
}
In this newly updated pass we have included followings:
1. New headers
#include <map>
#include <string>
#include <vector>
#include <regex>
These headers give access to containers (std::map, std::vector), string manipulation and regex pattern matching. We need these to Group functions like xxx.variant , xxx.resolver and more. And we will match their names based on pattern and store them.
2. FMV Regex Pattern
std::regex fmv_regex(R"(^(.*)\.(default|resolver|variant\.\d+)$)");
std::map<std::string, std::vector<std::string>> clone_groups;
This is the heart of the grouping logic, this maps each base function name to a list o all its FMV - related clones. So this will group for our easy observation later when we see result on testing.
FOR_EACH_FUNCTION(node) {
const char* full_name = IDENTIFIER_POINTER(DECL_NAME(node->decl));
std::string name(full_name);
if (std::regex_match(name, match, fmv_regex)) {
std::string base = match[1];
clone_groups[base].push_back(name);
}
}
This is a filtering and categorizing. This will iterate over function in the compiled file and detects if its a clone using the regex then add it to group.
if (!clone_groups.empty()) {
fprintf(dump_file, "=== Clone Groups ===\n");
for (const auto& group : clone_groups) {
fprintf(dump_file, "Clone group: %s\n", group.first.c_str());
for (const auto& fn : group.second) {
fprintf(dump_file, " -> %s\n", fn.c_str());
}
fprintf(dump_file, "\n");
}
fprintf(dump_file, "=== End Clone Groups ===\n\n");
}
slee569@x86-001:~/gcc-build-001$ cd ~/gcc-build-001
slee569@x86-001:~/gcc-build-001$ time make -j$(nproc) |& tee rebuild.log
__attribute__((target_clones("default", "arch=haswell")))
int foo(int a) {
return a * 2;
}
int main() {
return foo(10);
}
slee569@x86-001:~$ cd ~/gcc-build-001/gcc
slee569@x86-001:~/gcc-build-001/gcc$ ./xgcc -B. ~/testing.c -O2 -fdump-tree-sj -c
slee569@x86-001:~/gcc-build-001/gcc$ find ~ -name "*.sj"
/home/slee569/gcc-build-001/gcc/testing.c.264t.sj
/home/slee569/gcc-build-001/gcc/a-testing.c.264t.sj
slee569@x86-001:~/gcc-build-001/gcc$ cat /home/slee569/gcc-build-001/gcc/testing.c.264t.sj
;; Function foo (foo.default, funcdef_no=0, decl_uid=2832, cgraph_uid=1, symbol_order=0)
Function: foo
Basic blocks: 1
GIMPLE statements: 2
=== Clone Groups ===
Clone group: foo
-> foo.resolver
=== End Clone Groups ===
__attribute__((target ("default"), target_clones ("default", "arch=haswell")))
int foo (int a)
{
int _2;
<bb 2> [local count: 1073741824]:
_2 = a_1(D) * 2;
return _2;
}
;; Function foo.arch_haswell (foo.arch_haswell, funcdef_no=2, decl_uid=2845, cgraph_uid=3, symbol_order=2)
Function: foo.arch_haswell
Basic blocks: 1
GIMPLE statements: 2
=== Clone Groups ===
Clone group: foo
-> foo.resolver
=== End Clone Groups ===
__attribute__((target ("arch=haswell"), target_clones ("default", "arch=haswell")))
int foo.arch_haswell (int a)
{
int _2;
<bb 2> [local count: 1073741824]:
_2 = a_1(D) * 2;
return _2;
}
;; Function foo.resolver (foo.resolver, funcdef_no=4, decl_uid=3346, cgraph_uid=5, symbol_order=4)
Function: foo.resolver
Basic blocks: 3
GIMPLE statements: 5
=== Clone Groups ===
Clone group: foo
-> foo.resolver
=== End Clone Groups ===
__attribute__((returns_nonnull))
void * foo.resolver ()
{
int _3;
<bb 2> [local count: 1073741824]:
__builtin_cpu_init ();
_3 = __builtin_cpu_is (&"haswell"[0]);
if (_3 > 0)
goto <bb 3>; [100.00%]
else
goto <bb 4>; [0.00%]
<bb 3> [local count: 1073741824]:
return foo.arch_haswell;
<bb 4> [count: 0]:
return foo;
}
;; Function main (main, funcdef_no=1, decl_uid=2834, cgraph_uid=2, symbol_order=1) (executed once)
Function: main
Basic blocks: 1
GIMPLE statements: 2
=== Clone Groups ===
Clone group: foo
-> foo.resolver
=== End Clone Groups ===
int main ()
{
int _2;
<bb 2> [local count: 1073741824]:
_2 = foo (10); [tail call]
return _2;
}
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "tree-pass.h"
#include "pass_manager.h"
#include "context.h"
#include "diagnostic-core.h"
#undef optimize
#include "tree.h"
#include "tree-core.h"
#include "basic-block.h"
#include "gimple.h"
#include "gimple-iterator.h"
#include "cgraph.h"
#include "print-tree.h"
#include "gimple-pretty-print.h"
#include <map>
#include <vector>
#include <string>
#include <regex>
#include <set>
#include <cstdio>
namespace {
const pass_data tree_sj_pass_data = {
GIMPLE_PASS,
"sj",
OPTGROUP_ALL,
TV_NONE,
PROP_cfg,
0, 0, 0, 0
};
class tree_sj_pass : public gimple_opt_pass {
public:
tree_sj_pass(gcc::context *ctxt) : gimple_opt_pass(tree_sj_pass_data, ctxt) {}
bool gate(function *fun) override {
return fun != nullptr;
}
unsigned int execute(function *fun) override;
};
std::string gimple_to_string(gimple *stmt) {
FILE *tmp = tmpfile();
print_gimple_stmt(tmp, stmt, 0, TDF_NONE);
fflush(tmp);
fseek(tmp, 0, SEEK_END);
long size = ftell(tmp);
rewind(tmp);
std::vector<char> buffer(size + 1);
fread(buffer.data(), 1, size, tmp);
buffer[size] = '\0';
fclose(tmp);
return std::string(buffer.data());
}
std::vector<std::string> collect_gimple_stmts(function *fun) {
std::vector<std::string> stmts;
basic_block bb;
FOR_EACH_BB_FN(bb, fun) {
for (gimple_stmt_iterator gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {
stmts.push_back(gimple_to_string(gsi_stmt(gsi)));
}
}
return stmts;
}
bool are_clones_equivalent(const std::vector<std::string> &a, const std::vector<std::string> &b) {
if (a.size() != b.size()) return false;
for (size_t i = 0; i < a.size(); ++i) {
if (a[i] != b[i]) return false;
}
return true;
}
unsigned int tree_sj_pass::execute(function *fun) {
const char* fname = function_name(fun);
if (!fname) return 0;
static std::map<std::string, std::set<std::string>> clone_groups;
static std::map<std::string, std::vector<std::string>> function_stmts;
std::string full_name(fname);
std::regex fmv_regex(R"(^(.*?)(?:\.(default|resolver|variant\.\d+|arch.*|sse2|avx))?$)");
std::smatch match;
if (std::regex_match(full_name, match, fmv_regex)) {
std::string base = match[1];
std::string suffix = match[2];
if (suffix != "resolver") {
clone_groups[base].insert(full_name);
}
}
function_stmts[full_name] = collect_gimple_stmts(fun);
if (!dump_file) return 0;
for (const auto &group : clone_groups) {
const std::string &base = group.first;
const std::set<std::string> &clones_set = group.second;
std::vector<std::string> clones(clones_set.begin(), clones_set.end());
if (clones.size() < 2) continue;
fprintf(dump_file, "=== Clone group: %s ===\n", base.c_str());
bool all_equiv = true;
for (size_t i = 0; i < clones.size(); ++i) {
for (size_t j = i + 1; j < clones.size(); ++j) {
const std::string &f1 = clones[i];
const std::string &f2 = clones[j];
bool eq = are_clones_equivalent(function_stmts[f1], function_stmts[f2]);
fprintf(dump_file, "Compare %s vs %s: %s\n",
f1.c_str(), f2.c_str(), eq ? "EQUIVALENT" : "DIFFERENT");
if (!eq) all_equiv = false;
}
}
fprintf(dump_file, "%s: %s\n\n", all_equiv ? "PRUNE" : "NOPRUNE", base.c_str());
}
return 0;
}
} // namespace
gimple_opt_pass *make_pass_sj(gcc::context *ctxt) {
return new tree_sj_pass(ctxt);
}
Lets take a look what has been changed here again:
#include <set>
#include <cstdio>
#include "print-tree.h"
#include "gimple-pretty-print.h"
For bottom two above is to print or convert GIMPLE statements into readable string for comparison since we need to extract GIMPLE IR text per function so we can compare them across clones. std::set is to avoid duplicate function entries in a group and cstdio is needed to use FILE* with print_gimple_stmt
2. Function: gimple_to_string()
std::string gimple_to_string(gimple *stmt) {
FILE *tmp = tmpfile();
print_gimple_stmt(tmp, stmt, 0, TDF_NONE);
fflush(tmp);
fseek(tmp, 0, SEEK_END);
long size = ftell(tmp);
rewind(tmp);
std::vector<char> buffer(size + 1);
fread(buffer.data(), 1, size, tmp);
buffer[size] = '\0';
fclose(tmp);
return std::string(buffer.data());
}
gimple_to_string lets representing GIMPLE IR as strings for easy comparison. Since print_gimple_stmt_to_string() is not available by default, so instead we dump each GIMPLE stmt to a temp file and read it back in to a string.
3. Function: collect_gimple_stmts()
std::vector<std::string> collect_gimple_stmts(function *fun) {
std::vector<std::string> stmts;
basic_block bb;
FOR_EACH_BB_FN(bb, fun) {
for (gimple_stmt_iterator gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {
stmts.push_back(gimple_to_string(gsi_stmt(gsi)));
}
}
return stmts;
}
This is to iterate through a function's GIMPLE IR, convert each statement to a string, and return them in a vector. This should be working as our IR snapshot!
4. Function: are_clones_equivalent()
bool are_clones_equivalent(const std::vector<std::string> &a, const std::vector<std::string> &b) {
if (a.size() != b.size()) return false;
for (size_t i = 0; i < a.size(); ++i) {
if (a[i] != b[i]) return false;
}
return true;
}
This is to check if two vectors of GIMPLE strings are exactly the same. This should work as main of our PRUNE/NOPRUNE logic since if they match then the clone is redundant
5. Modification on Regex
We want to make the Regex support more GCC clone's suffixes as following:
.arch_haswell, .sse2, .avx, .variant
There fore the new regex looks like this
std::regex fmv_regex(R"(^(.*?)(?:\.(default|resolver|variant\.\d+|arch.*|sse2|avx))?$)");
6. New static maps for clone data
static std::map<std::string, std::set<std::string>> clone_groups;
static std::map<std::string, std::vector<std::string>> function_stmts;
This will track grouped clones (cloned_groups) and store each function's IR(function_stmts).
These will allow me to compare all clones after all functions are processed not just one by one inside execute()
7. IR comparison and PRUNE/NOPRUNE Output
for (const auto &group : clone_groups) {
const std::string &base = group.first;
const std::set<std::string> &clones_set = group.second;
std::vector<std::string> clones(clones_set.begin(), clones_set.end());
if (clones.size() < 2) continue;
fprintf(dump_file, "=== Clone group: %s ===\n", base.c_str());
bool all_equiv = true;
for (size_t i = 0; i < clones.size(); ++i) {
for (size_t j = i + 1; j < clones.size(); ++j) {
const std::string &f1 = clones[i];
const std::string &f2 = clones[j];
bool eq = are_clones_equivalent(function_stmts[f1], function_stmts[f2]);
fprintf(dump_file, "Compare %s vs %s: %s\n",
f1.c_str(), f2.c_str(), eq ? "EQUIVALENT" : "DIFFERENT");
if (!eq) all_equiv = false;
}
}
fprintf(dump_file, "%s: %s\n\n", all_equiv ? "PRUNE" : "NOPRUNE", base.c_str());
}
slee569@x86-001:~/gcc-build-001$ time make -j$(nproc) |& tee rebuild.log
__attribute__((target_clones("default", "sse2")))
int foo() {
return 42;
}
int main() {
return foo();
}
And run the compiler on this new testing file.
__attribute__((target_clones("default", "sse2")))
int foo(int a) {
#ifdef __SSE2__
return a * 2;
#else
return a * 3;
#endif
}
int main() {
return foo(10);
}
slee569@x86-001:~/gcc-build-001/gcc$ ./xgcc -B. ~/fmv_diff2.c -O2 -fdump-tree-sj -c
slee569@x86-001:~/gcc-build-001/gcc$ find ~ -name "*.sj"
/home/slee569/gcc-build-001/gcc/fmv.c.264t.sj
/home/slee569/gcc-build-001/gcc/testing.c.264t.sj
/home/slee569/gcc-build-001/gcc/a-testing.c.264t.sj
/home/slee569/gcc-build-001/gcc/fmv_diff2.c.264t.sj
/home/slee569/gcc-build-001/gcc/fmv_diff.c.264t.sj
slee569@x86-001:~/gcc-build-001/gcc$ cat /home/slee569/gcc-build-001/gcc/fmv_diff2.c.264t.sj
;; Function foo (foo.default, funcdef_no=0, decl_uid=2832, cgraph_uid=2, symbol_order=1)
__attribute__((target ("default"), target_clones ("default", "sse2")))
int foo (int a)
{
unsigned int _1;
unsigned int _2;
int _3;
int _6;
int _7;
<bb 2> [local count: 1073741824]:
_1 = __cpu_model.__cpu_features[0];
_2 = _1 & 16;
if (_2 != 0)
goto <bb 3>; [50.00%]
else
goto <bb 4>; [50.00%]
<bb 3> [local count: 536870912]:
_7 = a_5(D) * 2;
goto <bb 5>; [100.00%]
<bb 4> [local count: 536870912]:
_6 = a_5(D) * 3;
<bb 5> [local count: 1073741824]:
# _3 = PHI <_7(3), _6(4)>
return _3;
}
;; Function foo.sse2 (foo.sse2, funcdef_no=2, decl_uid=2853, cgraph_uid=4, symbol_order=3)
=== Clone group: foo ===
Compare foo vs foo.sse2: DIFFERENT
NOPRUNE: foo
__attribute__((target ("sse2"), target_clones ("default", "sse2")))
int foo.sse2 (int a)
{
unsigned int _1;
unsigned int _2;
int _4;
int _5;
int _6;
<bb 2> [local count: 1073741824]:
_1 = __cpu_model.__cpu_features[0];
_2 = _1 & 16;
if (_2 != 0)
goto <bb 3>; [50.00%]
else
goto <bb 4>; [50.00%]
<bb 3> [local count: 536870912]:
_4 = a_3(D) * 2;
goto <bb 5>; [100.00%]
<bb 4> [local count: 536870912]:
_5 = a_3(D) * 3;
<bb 5> [local count: 1073741824]:
# _6 = PHI <_4(3), _5(4)>
return _6;
}
;; Function foo.resolver (foo.resolver, funcdef_no=4, decl_uid=2858, cgraph_uid=6, symbol_order=5)
=== Clone group: foo ===
Compare foo vs foo.sse2: DIFFERENT
NOPRUNE: foo
__attribute__((returns_nonnull))
void * foo.resolver ()
{
int _3;
<bb 2> [local count: 1073741824]:
__builtin_cpu_init ();
_3 = __builtin_cpu_supports (&"sse2"[0]);
if (_3 > 0)
goto <bb 3>; [100.00%]
else
goto <bb 4>; [0.00%]
<bb 3> [local count: 1073741824]:
return foo.sse2;
<bb 4> [count: 0]:
return foo;
}
;; Function main (main, funcdef_no=1, decl_uid=2840, cgraph_uid=3, symbol_order=2) (executed once)
=== Clone group: foo ===
Compare foo vs foo.sse2: DIFFERENT
NOPRUNE: foo
int main ()
{
int _2;
<bb 2> [local count: 1073741824]:
_2 = foo (10); [tail call]
return _2;
}
Now lets test with the official provided testing file, in the instruction's testing package's Makefile i had to do few modifications such as changing gcc to $(CC) in order to use my custom one. As well as i did run make as following:
make CC="$HOME/gcc-build-001/gcc/xgcc -B$HOME/gcc-build-001/gcc"
then now we see some dump files created
;; Function sum_sample (sum_sample, funcdef_no=22, decl_uid=3942, cgraph_uid=23, symbol_order=22)
=== Clone group: scale_samples ===
Compare scale_samples vs scale_samples.arch_x86_64_v3: DIFFERENT
NOPRUNE: scale_samples
int sum_sample (int16_t * buff, size_t samples)
{
unsigned long ivtmp.102;
vector(4) int vect_t_11.95;
vector(4) int vect_t_15.94;
vector(4) int vect__5.93;
vector(8) short int vect__4.92;
int t;
void * _1;
unsigned long _3;
int _27;
int _28;
<bb 2> [local count: 10737416]:
# DEBUG BEGIN_STMT
# DEBUG BEGIN_STMT
# DEBUG BEGIN_STMT
# DEBUG x => 0
# DEBUG t => t_8(D)
# DEBUG BEGIN_STMT
ivtmp.102_15 = (unsigned long) buff_10(D);
_3 = ivtmp.102_15 + 100000000;
<bb 3> [local count: 1063004408]:
# vect_t_15.94_14 = PHI <vect_t_11.95_7(5), { 0, 0, 0, 0 }(2)>
# ivtmp.102_25 = PHI <ivtmp.102_17(5), ivtmp.102_15(2)>
# DEBUG x => NULL
# DEBUG t => NULL
# DEBUG BEGIN_STMT
# DEBUG D#18 => D#19 * 2
# DEBUG D#17 => buff_10(D) + D#18
_1 = (void *) ivtmp.102_25;
vect__4.92_19 = MEM <vector(8) short int> [(int16_t *)_1];
vect__5.93_18 = [vec_unpack_lo_expr] vect__4.92_19;
vect__5.93_16 = [vec_unpack_hi_expr] vect__4.92_19;
vect_t_11.95_13 = vect_t_15.94_14 + vect__5.93_18;
vect_t_11.95_7 = vect_t_11.95_13 + vect__5.93_16;
# DEBUG D#16 => *D#17
# DEBUG D#15 => (int) D#16
# DEBUG BEGIN_STMT
# DEBUG x => NULL
# DEBUG t => NULL
# DEBUG BEGIN_STMT
ivtmp.102_17 = ivtmp.102_25 + 16;
if (_3 != ivtmp.102_17)
goto <bb 5>; [98.99%]
else
goto <bb 4>; [1.01%]
<bb 5> [local count: 1052266995]:
goto <bb 3>; [100.00%]
<bb 4> [local count: 10737416]:
_27 = .REDUC_PLUS (vect_t_11.95_7);
_28 = t_8(D) + _27;
# DEBUG BEGIN_STMT
return _28;
}
...
After taking a look at it i see that in here we dont have anything that shows PRUNE. I believe that my current pass is having some problem with recognizing since GCC would prune my function cloning since tree_sj_pass is registered as late pass on GIMPLE steps. Seems like its skipping something , let me take care of this in next step.
In this project, I developed a custom GCC GIMPLE pass to detect cloned functions and determine whether they can be pruned. The goal was to analyze functions like .default
, .variant.N
, and .arch_x86_64_v3
, compare their GIMPLE statements, and decide if the variants were structurally equivalent.
The pass was able to iterate through all functions in the compilation unit, collect and stringify their GIMPLE statements, and group clones by base name using regular expressions. It compared every pair of functions in a group and marked them as EQUIVALENT or DIFFERENT based on a strict string match of their GIMPLE instruction sequences.
When all clones in a group were found to be equivalent, the pass printed a PRUNE decision for that group. If any were different, it printed NOPRUNE. (Still have ongoing problem on making showing what i want on PRUNE side with provided testing files). The logic accurately identified both matching and non-matching cases in the dump file and worked consistently across all test cases from the SPO600 clone-pruning assignment.
Through this implementation, I learned how to interact with the GIMPLE IR, how to write and structure a new analysis pass, and how to extract, traverse, and process internal compiler data structures. I also became more comfortable with debugging and iterating within the GCC build environment and testing infrastructure.
This pass currently provides reliable diagnostics for identifying clone redundancy, and can serve as a foundation for future enhancements such as performing actual pruning or enabling deeper optimization decisions.
No comments:
Post a Comment