Hardfaults from reading/writing to an unaligned address?

brian
Offline
Last seen:6 years 3 months ago
Expert Master
Joined:2014-10-16 18:10
I note that the Cortex Generic user's guide states that a hardfault can arise as follows:

an attempted load or store to an unaligned address

Is this true for the DA14580? If so, what is considered aligned? 8 bits? 16 bits? 32 bits?

If alignment is 16 bits, do I need to be careful how I define structs, or does the compiler pad them properly for me?

How could I trigger such a fault from C code that the compiler won't protect against? memcpy, pointers?

PY_Dialog
Offline
Last seen:2 years 11 months ago
Staff
Joined:2014-08-25 09:59
Hi Brian,

I think that if you use C code there should not be an alignment issue, because the C compiler will do the correction. There could be a malfunction if you don't take care over alignment, but it should not cause a memory issue.

Regards!
PY

brian
Offline
Last seen:6 years 3 months ago
Expert Master
Joined:2014-10-16 18:10
But I can consistently cause the error by removing an element from a structure that is declared and implemented (but not used). The fault is triggered by the STRH instruction. The map shows that the structure (mds_data) is placed just before some arch_main code, as shown below:

rwip_rf 0x0008071c Data 0 rom_symdef.txt ABSOLUTE
mds_data 0x00080768 Data 248 app.o(.bss)
cs_area$$Base 0x00080860 Number 0 arch_main.o(cs_area)
cs_table 0x00080860 Data 546 arch_main.o(cs_area)

Playing with unused elements of that structure will trigger the error on the STRH instruction. Interestingly, none of the code I have access to (except arch_main) is executed. I can place a breakpoint in the arch_main while loop and step through that loop, but the place in the application code that normally gets hit first (app_init() in app.c) is never reached. It can take up to 10 seconds or so before the hard fault appears.

VesaN
Offline
Last seen:5 years 7 months ago
Guru Master
Joined:2014-06-26 08:49
Hello brian,

Can you share the structure definition?

brian
Offline
Last seen:6 years 3 months ago
Expert Master
Joined:2014-10-16 18:10
It is here
struct mds_data_tag
{
unsigned char systemId[8];
unsigned char manufacturerName[32];
unsigned char modelNumber[32];
unsigned char serialNumber[32];
unsigned char firmwareRevision[32];
unsigned char hardwareRevision[32];
unsigned char softwareRevision[32];
unsigned char pnpId[8];
unsigned char continuaMajorVersion;
unsigned char continuaMinorVersion;
unsigned char numberOfCertifications;
unsigned short *certs;
unsigned short regStatus;
unsigned char batteryLevel;
unsigned char lengthOfTimeData;
unsigned char agentCurrentTime[8];
unsigned long accuracy;
unsigned short timeSyncMethod;
unsigned char ahdCurrentTime[8];
unsigned short ahdAccuracy;
unsigned short ahdTimeSyncMethod;
};

Only the strings at the start of the struct are currently used in code. However, I never get to app_init(), so that is a moot point. The error is triggered by removing certain elements after pnpId[8] while keeping the set of 'strings' the same. An example that caused the behavior is:

struct mds_data_tag
{
unsigned char systemId[8];
unsigned char manufacturerName[32];
unsigned char modelNumber[32];
unsigned char serialNumber[32];
unsigned char firmwareRevision[32];
unsigned char hardwareRevision[32];
unsigned char softwareRevision[32];
unsigned char pnpId[8];
unsigned char continuaMajorVersion;
};
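
One thing that can be checked without the target is how the two definitions differ in layout. The sketch below is only an illustration, not a diagnosis: the names mds_data_reduced and object_end are made up here, and it assumes the usual C layout rule that a struct takes the alignment of its most-aligned member. The full struct contains a pointer and an unsigned long, so on a 32-bit ARM build it is word aligned and padded (the map reports 248 bytes). The reduced struct contains only unsigned char members, so it needs no padding and has an odd size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The reduced struct from the post above, renamed for this sketch. */
struct mds_data_reduced
{
    unsigned char systemId[8];
    unsigned char manufacturerName[32];
    unsigned char modelNumber[32];
    unsigned char serialNumber[32];
    unsigned char firmwareRevision[32];
    unsigned char hardwareRevision[32];
    unsigned char softwareRevision[32];
    unsigned char pnpId[8];
    unsigned char continuaMajorVersion;
};

/* Only byte members, so no padding: 8 + 6*32 + 8 + 1 = 209 bytes. */
static_assert(sizeof(struct mds_data_reduced) == 209,
              "reduced struct is expected to be 209 bytes");

/* Hypothetical helper: first address after an object of 'size' bytes at 'base'. */
static inline uint32_t object_end(uint32_t base, size_t size)
{
    return base + (uint32_t)size;
}
```

With the base address from the map, object_end(0x00080768, 248) gives 0x00080860, which is exactly where cs_area begins. With the reduced size of 209 the same calculation gives an odd address, so anything packed directly behind the struct would only be as aligned as its own type requires. Whether the linker really places things that way in this build is an assumption that has to be confirmed in the new map file.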

gcblair
Offline
Last seen:4 years 10 months ago
Master
Joined:2014-09-08 10:21
Brian,

I would be interested to hear if you found a solution for this. It seems that Dialog are a bit slow this week

brian
Offline
Last seen:6 years 3 months ago
Expert Master
Joined:2014-10-16 18:10
I have not found a solution, but I have found that there tends to be trouble (hard faults) whenever an array is placed before something system-critical (like one of the system heaps). I have seen this in the map, and rather consistently for several array cases. A problem may be that I am running at the highest optimization level and am so close to running out of system space that I have to run in the optimized modes.

If I can arrange things so that these arrays are not next to one of these heaps or other areas of system data, the crash goes away. Another array we were working with was placed right before the non-retention heap when it was larger than 100 bytes, and at that point we got the error. At 100 bytes the map shifted significantly and the code ran. We still can't figure it out, as the buffer is used in only one area and the usage is pretty simple. The downside is that one has no control over how things are placed in memory at that level of detail (...and I wouldn't want to have to, either).

Yutaka
Offline
Last seen:3 years 4 months ago
Joined:2016-08-02 09:29
We have a similar issue.
Have you made any progress on this issue?

MT_dialog
Offline
Last seen:3 months 6 days ago
Staff
Joined:2015-06-08 11:34
Hi Yutaka,

There is no support for unaligned accesses on the Cortex-M0 processor. Any attempt to perform an unaligned memory access operation results in a Hard fault exception.

More info can be found at the link below.

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0497a/BAB...
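
To connect this with the earlier question about how C code can end up doing such an access: the usual route is casting a byte pointer to a wider pointer type and dereferencing it. Below is a minimal sketch, assuming a byte buffer that holds a little-endian 16-bit value at an arbitrary offset. The helper names read_le16 and write_le16 are made up for illustration and are not SDK functions.

```c
#include <stdint.h>

/*
 * Risky pattern (shown as a comment only, since it is undefined behaviour):
 *
 *     uint16_t v = *(const uint16_t *)(buf + 1);
 *
 * The compiler may emit a single LDRH here. If buf + 1 is an odd address,
 * a core without unaligned access support raises a hard fault.
 */

/* Safe read: assemble the value from individual bytes (LDRB only). */
static inline uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Safe write: store the value one byte at a time (STRB only). */
static inline void write_le16(uint8_t *p, uint16_t value)
{
    p[0] = (uint8_t)(value & 0xFFu);
    p[1] = (uint8_t)(value >> 8);
}
```

For example, read_le16(buf + 1) works for any offset into buf, at the cost of two byte loads instead of one halfword load. Packed structs and hand-computed offsets into message buffers are other common places where the cast pattern sneaks in.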

Thanks MT_dialog

Yutaka
Offline
Last seen:3 years 4 months ago
Joined:2016-08-02 09:29
But in the case of our product, the hard fault occurred in rwip_schedule().
Can a hard fault be caused by wrong parameters?

MT_dialog
Offline
Last seen:3 months 6 days ago
Staff
Joined:2015-06-08 11:34
Hi Yutaka,

There are quite a few reasons that a hard fault might occur.

How do you know that the hard fault occurs during rwip_schedule()? Have you checked the PC, and does the address before the processor stalls (in the hard fault handler debugging implementation) correspond to the rwip_schedule() function in the map file? Which parameters are you referring to? rwip_schedule() schedules the messages that are received by the stack or by your application and invokes the corresponding handler, so it is a bit unlikely that this is the function that causes Hardfault_handler() to be invoked.

Thanks MT_dialog

Yutaka
Offline
Last seen:3 years 4 months ago
Joined:2016-08-02 09:29
I checked by breaking and stepping in the debugger.
I also checked the stack memory.
I found the exact address where the hard fault occurred.

Address: 0x324CC
LDRH r3,[r1,r3]

R1:0x00000003
R3:0x0003E350

The CPU loaded from an illegal address.
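
The register values above are enough to work out the address that the LDRH tried to read. In the register-offset form LDRH Rt,[Rn,Rm] the effective address is Rn + Rm, and a halfword load needs an even address. A small sketch of that arithmetic, with helper names chosen only for this example:

```c
#include <stdint.h>

/* Effective address of LDRH Rt,[Rn,Rm]: base register plus offset register. */
static inline uint32_t ldrh_effective_address(uint32_t rn, uint32_t rm)
{
    return rn + rm;
}

/* A halfword access is aligned when bit 0 of the address is clear. */
static inline int is_halfword_aligned(uint32_t address)
{
    return (address & 1u) == 0u;
}
```

With R1 = 0x00000003 and R3 = 0x0003E350 the effective address is 0x0003E353, which is odd, so the load is unaligned. R1 = 3 also does not look like a plausible pointer, which suggests that one of the two operands was already wrong before this instruction ran.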

From the map file:
> ke_queue_extract 0x00032441 Thumb Code 0 rom_symdef.txt ABSOLUTE
> ke_queue_insert 0x0003247f Thumb Code 0 rom_symdef.txt ABSOLUTE
> ke_task_init 0x0003256d Thumb Code 0 rom_symdef.txt ABSOLUTE

The address (0x324CC) is within ke_queue_insert.
Which parameters does ke_queue_insert use?
Can I replace this function for debugging?

Yutaka
Offline
Last seen:3 years 4 months ago
Joined:2016-08-02 09:29
I analyzed the stack memory.
The address (0x324CC) isn't in ke_queue_insert.
It may be in a subroutine of ke_task_handler_get.

Stack memory
00000000 R0
00000003 R1
00007C6A R2
0003E350 R3
00081004 R12
0003274D LR
000324CC PC
01000000
20006378 R4
0003274D LR ke_task_handler_get
00000000
00000000 R0
00000000 R1
000817A8 R4
00000000 R5
50000020 R6
00000004 R7
200055F3 LR patched_gapm_adv_op_sanity
000805E4 R4
00000000 R5
50000020 R6
00032181 LR ke_event_schedule
50000000 R4
00032DD9 LR rwip_schedule
50000000 R4
20000671 LR main_func
00000000
00000000
20006B1C
20005C01

Notes (from the map file)
ke_event_schedule 0x0003213d Thumb Code 0 rom_symdef.txt ABSOLUTE
ke_task_handler_get 0x000326eb Thumb Code 0 rom_symdef.txt ABSOLUTE
rwip_schedule 0x00032dc9 Thumb Code 0 rom_symdef.txt ABSOLUTE
main_func 0x2000058b Thumb Code 592 arch_main.o(.text)
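
The first eight words of the dump match the frame that a Cortex-M core pushes on exception entry: R0, R1, R2, R3, R12, LR, PC and xPSR, in that order. Below is a minimal sketch for decoding such a frame on a host PC, assuming the eight words have been copied out of the debugger. The struct and function names are made up for this example. Thumb function symbols in the map file have bit 0 set, so that bit is cleared before comparing with a PC or return address.

```c
#include <stdint.h>

/* Registers stacked by the core on exception entry, lowest address first. */
struct exception_frame
{
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
};

/* Copy eight raw stack words into a named frame. */
static inline struct exception_frame decode_frame(const uint32_t *words)
{
    struct exception_frame f = {
        words[0], words[1], words[2], words[3],
        words[4], words[5], words[6], words[7]
    };
    return f;
}

/* Bit 24 of xPSR is the Thumb state bit; it should be set in a sane frame. */
static inline int thumb_bit_set(const struct exception_frame *f)
{
    return (int)((f->xpsr >> 24) & 1u);
}

/* Map values for Thumb code have bit 0 set; clear it to get the code address. */
static inline uint32_t code_address(uint32_t map_or_lr_value)
{
    return map_or_lr_value & ~(uint32_t)1u;
}
```

For the dump above, the decoded PC is 0x000324CC and the stacked LR 0x0003274D gives a return address of 0x0003274C, which lies after the start of ke_task_handler_get (0x000326EA once bit 0 is cleared). Note that rom_symdef.txt lists only exported ROM symbols, so the nearest preceding symbol is not necessarily the function that contains an address.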

MT_dialog
Offline
Last seen:3 months 6 days ago
Staff
Joined:2015-06-08 11:34
Hi Yutaka,

I see from the picture you uploaded that you have been tracing functions through the stack. On a hard fault or NMI, the SDK stores the last register values at one of two addresses, 0x81800 or 0x81850: one for the hard fault and the other for the NMI. From there you can find the address at which your code hit the hard fault without having to trace back through the stack.

As far as I can tell, the hard fault traces back to functions that the system uses constantly to schedule events. The most probable cause is data corruption, perhaps an invalid handler pointer or something similar. If you can identify the conditions under which this happens, you can find what causes this behaviour in your application: check whether the error follows a specific pattern or occurs when your application performs a specific action.

Thanks MT_dialog