• R/O
  • HTTP
  • SSH
  • HTTPS

Commit

Tags
Keine Tags

Frequently used words (click to add to your profile)

javac++androidlinuxc#windowsobjective-ccocoa誰得qtpythonphprubygameguibathyscaphec計画中(planning stage)翻訳omegatframeworktwitterdomtestvb.netdirectxゲームエンジンbtronarduinopreviewer

GNU Binutils with patches for OS216


Commit MetaInfo

Revision148d6384291720bcaaa062badf1179b6215f6da3 (tree)
Zeit2017-11-28 08:15:46
AutorMax Filippov <jcmvbkbc@gmai...>
CommiterMax Filippov

Log Message

gas: xtensa: implement trampoline coalescing

There is a recurring pattern in assembly files generated by a compiler
where a lot of jumps in a function are going to the same place. When
these jumps are relaxed with trampolines the assembler generates a
separate jump thread from each source.
Create an index of trampoline jump targets for each segment and see if a
jump being relaxed goes to a location from that index, in which case
replace its target with a location of existing trampoline jump that
results in the shortest path to the original target.

gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>

* config/tc-xtensa.c (trampoline_chain_entry, trampoline_chain)
(trampoline_chain_index): New structures.
(trampoline_index): Add chain_index field.
(xg_order_trampoline_chain_entry, xg_sort_trampoline_chain)
(xg_find_chain_entry, xg_get_best_chain_entry)
(xg_order_trampoline_chain, xg_get_trampoline_chain)
(xg_find_best_eq_target, xg_add_location_to_chain)
(xg_create_trampoline_chain, xg_get_single_symbol_slot): New
functions.
(xg_relax_fixups): Call xg_find_best_eq_target to adjust jump
target to point to an existing jump. Call
xg_create_trampoline_chain to create new jump target. Call
xg_add_location_to_chain to add newly created trampoline jump
to the corresponding chain.
(add_jump_to_trampoline): Extract loop searching for a single
slot with a symbol into a separate function, replace that code
with a call to that function.
(relax_frag_immed): Call xg_find_best_eq_target to adjust jump
target to point to an existing jump.
* testsuite/gas/xtensa/all.exp: Add trampoline-2 test.
* testsuite/gas/xtensa/trampoline.d: Adjust absolute addresses
as many duplicate trampoline chains are now coalesced.
* testsuite/gas/xtensa/trampoline.s: Add _nop so that objdump
stays in sync with instruction stream.
* testsuite/gas/xtensa/trampoline-2.l: New test result file.
* testsuite/gas/xtensa/trampoline-2.s: New test source file.

Ändern Zusammenfassung

Diff

--- a/gas/ChangeLog
+++ b/gas/ChangeLog
@@ -1,5 +1,34 @@
11 2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
22
3+ * config/tc-xtensa.c (trampoline_chain_entry, trampoline_chain)
4+ (trampoline_chain_index): New structures.
5+ (trampoline_index): Add chain_index field.
6+ (xg_order_trampoline_chain_entry, xg_sort_trampoline_chain)
7+ (xg_find_chain_entry, xg_get_best_chain_entry)
8+ (xg_order_trampoline_chain, xg_get_trampoline_chain)
9+ (xg_find_best_eq_target, xg_add_location_to_chain)
10+ (xg_create_trampoline_chain, xg_get_single_symbol_slot): New
11+ functions.
12+ (xg_relax_fixups): Call xg_find_best_eq_target to adjust jump
13+ target to point to an existing jump. Call
14+ xg_create_trampoline_chain to create new jump target. Call
15+ xg_add_location_to_chain to add newly created trampoline jump
16+ to the corresponding chain.
17+ (add_jump_to_trampoline): Extract loop searching for a single
18+ slot with a symbol into a separate function, replace that code
19+ with a call to that function.
20+ (relax_frag_immed): Call xg_find_best_eq_target to adjust jump
21+ target to point to an existing jump.
22+ * testsuite/gas/xtensa/all.exp: Add trampoline-2 test.
23+ * testsuite/gas/xtensa/trampoline.d: Adjust absolute addresses
24+ as many duplicate trampoline chains are now coalesced.
25+ * testsuite/gas/xtensa/trampoline.s: Add _nop so that objdump
26+ stays in sync with instruction stream.
27+ * testsuite/gas/xtensa/trampoline-2.l: New test result file.
28+ * testsuite/gas/xtensa/trampoline-2.s: New test source file.
29+
30+2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
31+
332 * config/tc-xtensa.c (search_trampolines, get_best_trampoline):
433 Remove definitions.
534 (xg_find_best_trampoline_for_tinsn): New function.
--- a/gas/config/tc-xtensa.c
+++ b/gas/config/tc-xtensa.c
@@ -7348,6 +7348,33 @@ xtensa_end (void)
73487348 xtensa_check_frag_count ();
73497349 }
73507350
7351+struct trampoline_chain_entry
7352+{
7353+ symbolS *sym;
7354+ addressT offset;
7355+};
7356+
7357+/* Trampoline chain for a given (sym, offset) pair is a sorted array
7358+ of locations of trampoline jumps leading there. Jumps are represented
7359+ as pairs (sym, offset): trampoline frag symbol and offset of the jump
7360+ inside the frag. */
7361+struct trampoline_chain
7362+{
7363+ struct trampoline_chain_entry target;
7364+ struct trampoline_chain_entry *entry;
7365+ size_t n_entries;
7366+ size_t n_max;
7367+ bfd_boolean needs_sorting;
7368+};
7369+
7370+struct trampoline_chain_index
7371+{
7372+ struct trampoline_chain *entry;
7373+ size_t n_entries;
7374+ size_t n_max;
7375+ bfd_boolean needs_sorting;
7376+};
7377+
73517378 struct trampoline_index
73527379 {
73537380 fragS **entry;
@@ -7359,7 +7386,10 @@ struct trampoline_seg
73597386 {
73607387 struct trampoline_seg *next;
73617388 asection *seg;
7389+ /* Trampolines ordered by their frag fr_address */
73627390 struct trampoline_index index;
7391+ /* Known trampoline chains ordered by (sym, offset) pair */
7392+ struct trampoline_chain_index chain_index;
73637393 };
73647394
73657395 static struct trampoline_seg trampoline_seg_list;
@@ -7520,6 +7550,187 @@ static bfd_boolean xg_is_trampoline_frag_full (const fragS *fragP)
75207550 return fragP->fr_var < 3;
75217551 }
75227552
7553+static int xg_order_trampoline_chain_entry (const void *a, const void *b)
7554+{
7555+ const struct trampoline_chain_entry *pa = a;
7556+ const struct trampoline_chain_entry *pb = b;
7557+
7558+ if (pa->sym == pb->sym ||
7559+ S_GET_VALUE (pa->sym) == S_GET_VALUE (pb->sym))
7560+ if (pa->offset == pb->offset)
7561+ return 0;
7562+ else
7563+ return pa->offset < pb->offset ? -1 : 1;
7564+ else
7565+ return S_GET_VALUE (pa->sym) < S_GET_VALUE (pb->sym) ? -1 : 1;
7566+}
7567+
7568+static void xg_sort_trampoline_chain (struct trampoline_chain *tc)
7569+{
7570+ qsort (tc->entry, tc->n_entries, sizeof (*tc->entry),
7571+ xg_order_trampoline_chain_entry);
7572+ tc->needs_sorting = FALSE;
7573+}
7574+
7575+/* Find entry index in the given chain with maximal address <= source. */
7576+static size_t xg_find_chain_entry (struct trampoline_chain *tc,
7577+ addressT source)
7578+{
7579+ size_t a = 0;
7580+ size_t b = tc->n_entries;
7581+
7582+ if (tc->needs_sorting)
7583+ xg_sort_trampoline_chain (tc);
7584+
7585+ while (b - a > 1)
7586+ {
7587+ size_t c = (a + b) / 2;
7588+ struct trampoline_chain_entry *e = tc->entry + c;
7589+
7590+ if (S_GET_VALUE(e->sym) + e->offset <= source)
7591+ a = c;
7592+ else
7593+ b = c;
7594+ }
7595+ return a;
7596+}
7597+
7598+/* Find the best jump target for the source in the given trampoline chain.
7599+ The best jump target is the one that results in the shortest path to the
7600+ final target, it's the location of the jump closest to the final target,
7601+ but within the J_RANGE - J_MARGIN from the source. */
7602+static struct trampoline_chain_entry *
7603+xg_get_best_chain_entry (struct trampoline_chain *tc, addressT source)
7604+{
7605+ addressT target = S_GET_VALUE(tc->target.sym) + tc->target.offset;
7606+ size_t i = xg_find_chain_entry (tc, source);
7607+ struct trampoline_chain_entry *e = tc->entry + i;
7608+ int step = target < source ? -1 : 1;
7609+ addressT chained_target;
7610+ offsetT off;
7611+
7612+ if (target > source &&
7613+ S_GET_VALUE(e->sym) + e->offset <= source &&
7614+ i + 1 < tc->n_entries)
7615+ ++i;
7616+
7617+ while (i + step < tc->n_entries)
7618+ {
7619+ struct trampoline_chain_entry *next = tc->entry + i + step;
7620+
7621+ chained_target = S_GET_VALUE(next->sym) + next->offset;
7622+ off = source - chained_target;
7623+
7624+ if (labs (off) >= J_RANGE - J_MARGIN)
7625+ break;
7626+
7627+ i += step;
7628+ }
7629+
7630+ e = tc->entry + i;
7631+ chained_target = S_GET_VALUE(e->sym) + e->offset;
7632+ off = source - chained_target;
7633+
7634+ if (labs (off) < J_MARGIN ||
7635+ labs (off) >= J_RANGE - J_MARGIN)
7636+ return &tc->target;
7637+ return tc->entry + i;
7638+}
7639+
7640+static int xg_order_trampoline_chain (const void *a, const void *b)
7641+{
7642+ const struct trampoline_chain *pa = a;
7643+ const struct trampoline_chain *pb = b;
7644+
7645+ return xg_order_trampoline_chain_entry (&pa->target, &pb->target);
7646+}
7647+
7648+static struct trampoline_chain *
7649+xg_get_trampoline_chain (struct trampoline_seg *ts,
7650+ symbolS *sym,
7651+ addressT offset)
7652+{
7653+ struct trampoline_chain_index *idx = &ts->chain_index;
7654+ struct trampoline_chain c;
7655+
7656+ if (idx->needs_sorting)
7657+ {
7658+ qsort (idx->entry, idx->n_entries, sizeof (*idx->entry),
7659+ xg_order_trampoline_chain);
7660+ idx->needs_sorting = FALSE;
7661+ }
7662+ c.target.sym = sym;
7663+ c.target.offset = offset;
7664+ return bsearch (&c, idx->entry, idx->n_entries,
7665+ sizeof (struct trampoline_chain),
7666+ xg_order_trampoline_chain);
7667+}
7668+
7669+/* Find trampoline chain in the given trampoline segment that is going
7670+ to the *sym + *offset. If found, replace *sym and *offset with the
7671+ best jump target in that chain. */
7672+static struct trampoline_chain *
7673+xg_find_best_eq_target (struct trampoline_seg *ts,
7674+ addressT source, symbolS **sym,
7675+ addressT *offset)
7676+{
7677+ struct trampoline_chain *tc = xg_get_trampoline_chain (ts, *sym, *offset);
7678+
7679+ if (tc)
7680+ {
7681+ struct trampoline_chain_entry *e = xg_get_best_chain_entry (tc, source);
7682+
7683+ *sym = e->sym;
7684+ *offset = e->offset;
7685+ }
7686+ return tc;
7687+}
7688+
7689+static void xg_add_location_to_chain (struct trampoline_chain *tc,
7690+ symbolS *sym, addressT offset)
7691+{
7692+ struct trampoline_chain_entry *e;
7693+
7694+ if (tc->n_entries == tc->n_max)
7695+ {
7696+ tc->n_max = (tc->n_max + 1) * 2;
7697+ tc->entry = xrealloc (tc->entry, sizeof (*tc->entry) * tc->n_max);
7698+ }
7699+ e = tc->entry + tc->n_entries;
7700+ e->sym = sym;
7701+ e->offset = offset;
7702+ ++tc->n_entries;
7703+ tc->needs_sorting = TRUE;
7704+}
7705+
7706+static struct trampoline_chain *
7707+xg_create_trampoline_chain (struct trampoline_seg *ts,
7708+ symbolS *sym, addressT offset)
7709+{
7710+ struct trampoline_chain_index *idx = &ts->chain_index;
7711+ struct trampoline_chain *tc;
7712+
7713+ if (idx->n_entries == idx->n_max)
7714+ {
7715+ idx->n_max = (idx->n_max + 1) * 2;
7716+ idx->entry = xrealloc (idx->entry,
7717+ sizeof (*idx->entry) * idx->n_max);
7718+ }
7719+
7720+ tc = idx->entry + idx->n_entries;
7721+ tc->target.sym = sym;
7722+ tc->target.offset = offset;
7723+ tc->entry = NULL;
7724+ tc->n_entries = 0;
7725+ tc->n_max = 0;
7726+ xg_add_location_to_chain (tc, sym, offset);
7727+
7728+ ++idx->n_entries;
7729+ idx->needs_sorting = TRUE;
7730+
7731+ return tc;
7732+}
7733+
75237734 void dump_trampolines (void);
75247735
75257736 void
@@ -9200,9 +9411,24 @@ static void xg_relax_fixups (struct trampoline_seg *ts)
92009411 for (fx = seginfo->fix_root; fx; fx = fx->fx_next)
92019412 {
92029413 fixS *fixP = fx;
9414+ struct trampoline_chain *tc = NULL;
9415+
9416+ if (xg_is_relaxable_fixup (fixP))
9417+ {
9418+ tc = xg_find_best_eq_target (ts, fixP->fx_frag->fr_address,
9419+ &fixP->fx_addsy, &fixP->fx_offset);
9420+ if (!tc)
9421+ tc = xg_create_trampoline_chain (ts, fixP->fx_addsy,
9422+ fixP->fx_offset);
9423+ gas_assert (tc);
9424+ }
92039425
92049426 while (xg_is_relaxable_fixup (fixP))
9205- fixP = xg_relax_fixup (idx, fixP);
9427+ {
9428+ fixP = xg_relax_fixup (idx, fixP);
9429+ xg_add_location_to_chain (tc, fixP->fx_frag->fr_symbol,
9430+ fixP->fx_where);
9431+ }
92069432 }
92079433 }
92089434
@@ -9889,14 +10115,14 @@ init_trampoline_frag (fragS *fp)
988910115 return growth;
989010116 }
989110117
9892-
989310118 static int
9894-add_jump_to_trampoline (fragS *tramp, fragS *origfrag)
10119+xg_get_single_symbol_slot (fragS *fragP)
989510120 {
9896- int i, slot = -1;
10121+ int i;
10122+ int slot = -1;
989710123
989810124 for (i = 0; i < MAX_SLOTS; ++i)
9899- if (origfrag->tc_frag_data.slot_symbols[i])
10125+ if (fragP->tc_frag_data.slot_symbols[i])
990010126 {
990110127 gas_assert (slot == -1);
990210128 slot = i;
@@ -9904,10 +10130,19 @@ add_jump_to_trampoline (fragS *tramp, fragS *origfrag)
990410130
990510131 gas_assert (slot >= 0 && slot < MAX_SLOTS);
990610132
10133+ return slot;
10134+}
10135+
10136+static fixS *
10137+add_jump_to_trampoline (fragS *tramp, fragS *origfrag)
10138+{
10139+ int slot = xg_get_single_symbol_slot (origfrag);
10140+ fixS *fixP;
10141+
990710142 /* Assemble a jump to the target label in the trampoline frag. */
9908- xg_append_jump (tramp,
9909- origfrag->tc_frag_data.slot_symbols[slot],
9910- origfrag->tc_frag_data.slot_offsets[slot]);
10143+ fixP = xg_append_jump (tramp,
10144+ origfrag->tc_frag_data.slot_symbols[slot],
10145+ origfrag->tc_frag_data.slot_offsets[slot]);
991110146
991210147 /* Modify the original j to point here. */
991310148 origfrag->tc_frag_data.slot_symbols[slot] = tramp->fr_symbol;
@@ -9923,7 +10158,7 @@ add_jump_to_trampoline (fragS *tramp, fragS *origfrag)
992310158 xg_remove_trampoline_from_index (&ts->index, tr);
992410159 }
992510160
9926- return 3;
10161+ return fixP;
992710162 }
992810163
992910164
@@ -10072,15 +10307,42 @@ relax_frag_immed (segT segP,
1007210307 istack.insn[istack.ninsn - 2].opcode == xtensa_j_opcode)
1007310308 {
1007410309 TInsn *jinsn = &istack.insn[istack.ninsn - 2];
10310+ struct trampoline_seg *ts = find_trampoline_seg (segP);
10311+ struct trampoline_chain *tc = NULL;
1007510312
10076- if (!xg_symbolic_immeds_fit (jinsn, segP, fragP, fragP->fr_offset, total_text_diff))
10313+ if (ts &&
10314+ !xg_symbolic_immeds_fit (jinsn, segP, fragP, fragP->fr_offset,
10315+ total_text_diff))
10316+ {
10317+ int s = xg_get_single_symbol_slot (fragP);
10318+ addressT offset = fragP->tc_frag_data.slot_offsets[s];
10319+
10320+ tc = xg_find_best_eq_target (ts, fragP->fr_address,
10321+ &fragP->tc_frag_data.slot_symbols[s],
10322+ &offset);
10323+
10324+ if (!tc)
10325+ tc = xg_create_trampoline_chain (ts,
10326+ fragP->tc_frag_data.slot_symbols[s],
10327+ offset);
10328+ fragP->tc_frag_data.slot_offsets[s] = offset;
10329+ tinsn_immed_from_frag (jinsn, fragP, s);
10330+ }
10331+
10332+ if (!xg_symbolic_immeds_fit (jinsn, segP, fragP, fragP->fr_offset,
10333+ total_text_diff))
1007710334 {
1007810335 fragS *tf = xg_find_best_trampoline_for_tinsn (jinsn, fragP);
1007910336
1008010337 if (tf)
1008110338 {
10082- this_text_diff += init_trampoline_frag (tf);
10083- this_text_diff += add_jump_to_trampoline (tf, fragP);
10339+ fixS *fixP;
10340+
10341+ this_text_diff += init_trampoline_frag (tf) + 3;
10342+ fixP = add_jump_to_trampoline (tf, fragP);
10343+ xg_add_location_to_chain (tc, fixP->fx_frag->fr_symbol,
10344+ fixP->fx_where);
10345+ fragP->tc_frag_data.relax_seen = FALSE;
1008410346 }
1008510347 else
1008610348 {
--- a/gas/testsuite/gas/xtensa/all.exp
+++ b/gas/testsuite/gas/xtensa/all.exp
@@ -99,6 +99,7 @@ if [istarget xtensa*-*-*] then {
9999 run_dump_test "weak-call"
100100 run_dump_test "jlong"
101101 run_dump_test "trampoline"
102+ run_list_test "trampoline-2"
102103 run_dump_test "first_frag_align"
103104 run_dump_test "auto-litpools"
104105 run_dump_test "auto-litpools-first1"
--- /dev/null
+++ b/gas/testsuite/gas/xtensa/trampoline-2.l
@@ -0,0 +1 @@
1+# No warnings or errors expected!
--- /dev/null
+++ b/gas/testsuite/gas/xtensa/trampoline-2.s
@@ -0,0 +1,20 @@
1+ .text
2+
3+ .rep 4000
4+ bnez a2, .L1
5+ .endr
6+
7+ .rep 100000
8+ _nop
9+ .endr
10+
11+.L1:
12+ j .L1
13+
14+ .rep 100000
15+ _nop
16+ .endr
17+
18+ .rep 4000
19+ bnez a2, .L1
20+ .endr
--- a/gas/testsuite/gas/xtensa/trampoline.d
+++ b/gas/testsuite/gas/xtensa/trampoline.d
@@ -5,29 +5,28 @@
55 .*: +file format .*xtensa.*
66 #...
77 .*0:.*j.0x1194c
8-.*3:.*j.0x1194f
9-.*6:.*j.0x11952
10-.*9:.*j.0x11955
8+.*3:.*j.0x1194c
9+.*6:.*j.0x1194f
10+.*9:.*j.0x11952
1111 #...
12-.*11949:.*j.0x11958
13-.*1194c:.*j.0x24a0b
12+.*11949:.*j.0x11955
13+.*1194c:.*j.0x24a08
1414 .*1194f:.*j.0x24a0b
15-.*11952:.*j.0x24a0e
16-.*11955:.*j.0x2d687
15+.*11952:.*j.0x2d684
1716 #...
17+.*24a08:.*j.0x24a08
1818 .*24a0b:.*j.0x24a0b
19-.*24a0e:.*j.0x24a0e
2019 #...
21-.*2d684:.*ret
22-.*2d687:.*j.0x49404
20+.*2d681:.*ret
21+.*2d684:.*j.0x49401
2322 #...
24-.*49404:.*j.0x49404
25-.*49407:.*beqz.n.a2,.0x4940c
26-.*49409:.*j.0x61aa2
23+.*49401:.*j.0x49401
24+.*49404:.*beqz.n.a2,.0x49409
25+.*49406:.*j.0x61a9f
2726 #...
28-.*61aa2:.*j.0x7a13b
27+.*61a9f:.*j.0x7a138
2928 #...
30-.*7a13b:.*j.0x927f5
29+.*7a138:.*j.0x927f8
3130 #...
32-.*927f5:.*j.0x927f5
31+.*927f8:.*j.0x927f8
3332 #...
--- a/gas/testsuite/gas/xtensa/trampoline.s
+++ b/gas/testsuite/gas/xtensa/trampoline.s
@@ -25,6 +25,7 @@
2525 _ret
2626 .endr
2727 _nop
28+ _nop
2829 4:
2930 j 4b
3031