Skip to content

Conversation

@TeroJaasko
Copy link
Contributor

Description

The thread stacks and other large variables in rtx_lib.c were directed
to .bss.os section, which puts them to zero initialized section in GCC,
but on ARMC5 they were put to data section which consumes precious ROM.

In reality the data section compression may make this a NOP change,
but it will at least make the total results now more reliable.

Effects of this PR on the output of mbed compile of one application:

--8<--8<--
--- memory_before_bss_change.txt	2018-06-18 16:45:51.928844313 +0300
+++ memory_after_bss_change.txt	2018-06-18 16:57:27.753532437 +0300
@@ -5,6 +5,9 @@
 +----------------------------------------------------------+--------+-------+-------+
 | Module                                                   |  .text | .data |  .bss |
 +----------------------------------------------------------+--------+-------+-------+
@@ -366,7 +369,7 @@
 | mbed-os/rtos/TARGET_CORTEX/rtx5/RTX/Source/rtx_kernel.o  |    801 |   164 |     0 |
-| mbed-os/rtos/TARGET_CORTEX/rtx5/RTX/Source/rtx_lib.o     |    406 |  2117 |     0 |
+| mbed-os/rtos/TARGET_CORTEX/rtx5/RTX/Source/rtx_lib.o     |    406 |     1 |  2116 |
 | mbed-os/rtos/TARGET_CORTEX/rtx5/RTX/Source/rtx_memory.o  |    260 |     0 |     0 |
@@ -412,9 +415,9 @@
-| Subtotals                                                | 163559 |  3380 | 15168 |
+| Subtotals                                                | 163559 |  1264 | 17284 |
 +----------------------------------------------------------+--------+-------+-------+
 Total Static RAM memory (data + bss): 18548 bytes
-Total Flash memory (text + data): 166939 bytes
+Total Flash memory (text + data): 164823 bytes

Pull request type

[x] Fix
[ ] Refactor
[ ] New target
[ ] Feature
[ ] Breaking change

@TeroJaasko
Copy link
Contributor Author

This works on my K64F, not tested on other platforms. In practice this will change the relative order of the variables in RAM, but that should not break things.

The thread stacks and other large variables in rtx_lib.c were directed
to .bss.os section, which puts them to zero initialized section in GCC,
but on ARMC5 they were put to data section which consumes precious ROM.

In reality the data section compression may make this a NOP change,
but it will at least make the total results now more reliable.

Effects of this PR on the output of mbed compile of one application:
--8<--8<--
--- memory_before_bss_change.txt	2018-06-18 16:45:51.928844313 +0300
+++ memory_after_bss_change.txt	2018-06-18 16:57:27.753532437 +0300
@@ -5,6 +5,9 @@
 +----------------------------------------------------------+--------+-------+-------+
 | Module                                                   |  .text | .data |  .bss |
 +----------------------------------------------------------+--------+-------+-------+
@@ -366,7 +369,7 @@
 | mbed-os/rtos/TARGET_CORTEX/rtx5/RTX/Source/rtx_kernel.o  |    801 |   164 |     0 |
-| mbed-os/rtos/TARGET_CORTEX/rtx5/RTX/Source/rtx_lib.o     |    406 |  2117 |     0 |
+| mbed-os/rtos/TARGET_CORTEX/rtx5/RTX/Source/rtx_lib.o     |    406 |     1 |  2116 |
 | mbed-os/rtos/TARGET_CORTEX/rtx5/RTX/Source/rtx_memory.o  |    260 |     0 |     0 |
@@ -412,9 +415,9 @@
-| Subtotals                                                | 163559 |  3380 | 15168 |
+| Subtotals                                                | 163559 |  1264 | 17284 |
 +----------------------------------------------------------+--------+-------+-------+
 Total Static RAM memory (data + bss): 18548 bytes
-Total Flash memory (text + data): 166939 bytes
+Total Flash memory (text + data): 164823 bytes
@0xc0170
Copy link
Contributor

0xc0170 commented Jun 19, 2018

@JonatanAntoni Please review

Copy link
Member

@bulislaw bulislaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should be done in RTX not Mbed OS, we try not to modify CMSIS/RTX but pull changes from upstream.

@JonatanAntoni
Copy link
Member

@TeroJaasko, can you confirm that you use the linker option --datacompressor off with ARMCC5? If not I cannot see why then difference is that huge. What I can see using a minimal example is that the linker moves a variable from ZI to RW as soon as I manually assign it to a section, i.e. add section attribute.

@kjbracey
Copy link
Contributor

If not I cannot see why then difference is that huge.

It isn't at present. The stats for "flash size" are misleading, because of the compression.

However, there is actually very little (really) initialised data - so little that the decompression code itself would potentially not be needed for a minimal image, so Tero is partly chasing a potential saving of omitting it. I understand that this will happen automatically once the initialised data is small enough.

the linker moves a variable from ZI to RW as soon as I manually assign it to a section, i.e. add section attribute.

The ARM linker works primarily by section attributes, not names, so when you name a section for a data object it has to assume that the named section is potentially initialised - this variable isn't, but maybe something else in the section might be. You need to put the "zeroinit" attribute on to assure it that the entire section is zero-init, not just this variable.

@TeroJaasko
Copy link
Contributor Author

TeroJaasko commented Jun 20, 2018

@JonatanAntoni : No, I use just the default Mbed OS' release profile. The same kind of numbers can be reproduced with mbed-os-blinky:

--- stock_master.txt	2018-06-20 11:50:47.535629894 +0300
+++ bss_opt.txt	2018-06-20 11:52:08.629442184 +0300
@@ -143,7 +143,7 @@
 | mbed-os/rtos/TARGET_CORTEX/rtx5/RTX/Source/rtx_kernel.o             |   745 |   164 |    0 |
-| mbed-os/rtos/TARGET_CORTEX/rtx5/RTX/Source/rtx_lib.o                |   406 |  2117 |    0 |
+| mbed-os/rtos/TARGET_CORTEX/rtx5/RTX/Source/rtx_lib.o                |   406 |     1 | 2116 |
 | mbed-os/rtos/TARGET_CORTEX/rtx5/RTX/Source/rtx_memory.o             @@ -175,9 +175,9 @@
-| Subtotals                                                           | 32894 |  2565 | 7140 |
+| Subtotals                                                           | 32894 |   449 | 9256 |
 +---------------------------------------------------------------------+-------+-------+------+
 Total Static RAM memory (data + bss): 9705 bytes
-Total Flash memory (text + data): 35459 bytes
+Total Flash memory (text + data): 33343 bytes
  Image: ./BUILD/K64F/ARM/mbed-os-example-blinky.bin

Kevin's assessment matches my observations too, although I'm not aiming to get rid of compression code, just to put the binary flash size below target.;) The Flash memory numbers Mbed OS shows do not take the section compression into account, they are just .text+.data numbers added together. Also the padding and compiler generated veneers seems to be omitted, but those are not that meaningful.

Just for the reference, here is mbed compile output:

+-------------+-------+-------+------+
| Module      | .text | .data | .bss |
+-------------+-------+-------+------+
| [lib]       | 10739 |    16 |  348 |
| anon$$obj.o |    32 |     0 | 1024 |
| main.o      |    60 |     4 |    0 |
| mbed-os     | 22063 |  2545 | 5768 |
| Subtotals   | 32894 |  2565 | 7140 |
+-------------+-------+-------+------+
Total Static RAM memory (data + bss): 9705 bytes
Total Flash memory (text + data): 35459 bytes
Image: ./BUILD/K64F/ARM/mbed-os-example-blinky.bin

and ARMCC's map for same build:

Image component sizes

      Code (inc. data)   RO Data    RW Data    ZI Data      Debug   
     19526       2396       2688       2560       6792      46436   Object Totals
         0          0         32          0       1024          0   (incl. Generated)
        52          0          7         11          0          0   (incl. Padding)
     10202        446        556         16        348       7652   Library Totals
        14          0          5          0          0          0   (incl. Padding)

==============================================================================

      Code (inc. data)   RO Data    RW Data    ZI Data      Debug   
     29728       2842       3244       2576       7140      14084   Grand Totals
     29728       2842       3244         56       7140      14084   ELF Image Totals (compressed)
     29728       2842       3244         56          0          0   ROM Totals
==============================================================================

    Total RO  Size (Code + RO Data)                32972 (  32.20kB)
    Total RW  Size (RW Data + ZI Data)              9716 (   9.49kB)
    Total ROM Size (Code + RO Data + RW Data)      33028 (  32.25kB)

==============================================================================

@yogpan01
Copy link
Contributor

@bulislaw can you provide link for the repo where we should open this PR?

@0xc0170
Copy link
Contributor

0xc0170 commented Jun 25, 2018

@bulislaw can you provide link for the repo where we should open this PR?

https://github.com/ARM-software/CMSIS_5/

@TeroJaasko
Copy link
Contributor Author

@bulislaw : fair enough. Created a PR for CMSIS_5 repo on ARM-software/CMSIS_5#382

@cmonr
Copy link
Contributor

cmonr commented Jul 3, 2018

@bulislaw Since the PR is now open against CMSIS, should this one be closed?

@deepikabhavnani
Copy link

@cmonr - Updates from CMSIS to Mbed OS take time, we can merge this PR once it is approved in CMSIS and not wait for full CMSIS/RTX update.

@cmonr
Copy link
Contributor

cmonr commented Jul 3, 2018

@deepikabhavnani Fine by me.

@cmonr
Copy link
Contributor

cmonr commented Jul 16, 2018

@deepikabhavnani Any idea how responsive the CMSIS maintainers tend to be? Looking at that repo's activity, I have a feeling this PR will close before any activity is seen in that repo.

@cmonr
Copy link
Contributor

cmonr commented Aug 9, 2018

Reviving conversation on this PR.

Changes are unlikely to be pulled into CMSIS because they're comiler-specific. If we want to bring these in, we'll need to support them as a part of the Mbed OS repo.

@JonatanAntoni Does that sound about right?

@TeroJaasko @0xc0170 @kjbracey-arm @bulislaw @theotherjimmy @deepikabhavnani @SenRamakri Thoughts?

@cmonr
Copy link
Contributor

cmonr commented Aug 24, 2018

@c1728p9 Do you know how the CMSIS update PR interacts with this potential fix? Is this PR still needed?

@deepikabhavnani
Copy link

CMSIS update would not be touching this fix, as it is not yet integrated into CMSIS.

@TeroJaasko - Shared updated fix by mail, but it is yet not pushed to CMSIS and mbed-os repo. Please apply the update fix to PR's

@TeroJaasko
Copy link
Contributor Author

@deepikabhavnani : CMSIS PR is there: ARM-software/CMSIS_5#407

@deepikabhavnani
Copy link

@TeroJaasko - Since the fix is approved in CMSIS (ARM-software/CMSIS_5#407 (comment)), we can update and merge here.

If you will like to wait for CMSIS merge and update that would be likely in 5.11 (CMSIS updates are merged in mbed-os once every minor release)

@deepikabhavnani
Copy link

As per last update, changes in CMSIS were reverted as it was breaking change for existing projects.
Shall we close it in mbed-os till we have some resolution from CMSIS?

@JonatanAntoni
Copy link
Member

Hi @deepikabhavnani,

I admit this is not an ideal situation. Currently we don't see how we could fix this issue. Any solution proposal we discussed so far would lead to negative impact under certain conditions. It looks like there is no global solution to force the sections into ZI memory without touching every occurrence.

You can apply the patch (adding attribute((zero_init))) locally for mbedOS but I guess we won't be able to upstream that. Please keep in mind to add this everywhere you create control blocks outside the RTX-controlled memory. RTX aware debugging (at least in MDK) relies on these section names.

Cheers,
Jonatan

@0xc0170
Copy link
Contributor

0xc0170 commented Sep 21, 2018

Thanks @JonatanAntoni for the update here.

I'll close this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants