Pauses in DA14581 idle PDUs

JamesHiebert
Offline
Last seen: 5 months 1 week ago
Joined: 2014-10-24 14:17
Hello Dialog,

We are seeing some odd behavior with some of our systems containing DA14581 chips. In our system, the DA14581 is the slave device and different vendors are supplying the master BLE chips.

On a small minority of such systems, we observe random pauses in empty PDU transmissions from the DA14581. A BLE packet sniffer trace is attached. The random dropouts are visible in the trace as the drops in throughput. Each pause is always approximately 1.1 - 1.2 seconds in duration. There is no application layer traffic being exchanged at the time of the trace - these are all idle packets.

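Our systems should be identically built and configured. The application running at the time of the trace was downloaded via the IDE (not burned into OTP). This behavior is exhibited on only a small number of such systems. Slave latency is set to 0 (confirmed multiple times!).

Any ideas about what might be behind this behavior? In these cases it causes the connection to drop.

Thanks,

Jim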

MT_dialog
Offline
Last seen: 1 month 1 week ago
Staff
Joined:2015-06-08 11:34
Hi JamesHiebert,

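So this behavior is exhibited randomly on devices that share the same build and FW? Is that correct? I mean, are the devices that exhibit this behavior unique, i.e. have you run this test on multiple devices in a similar environment, with only one device (for example) showing this behavior in consecutive tests? What about the master of the connection? Is this behavior related to a specific master, or does it occur regardless of the master of the link?

Do you have an actual sniffer trace from when the event occurs that you can share, and have you verified that this side effect is due to missed events on the 581's side, so that on the sniffer log you can see packets from the master and no packets from the slave side? You have attached a graph of the throughput, but that doesn't help in finding a root cause. What would help is a current capture at the time of the issue, to actually see through the power consumption that the 581 misses connection events. It would also help to mention which SDK you are using and the configuration of the device: is the application running entirely on the 581, or are you using the HCI interface? I assume the application runs entirely on the 581. What about your low power clock, are you using RCX or XTAL?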

Thanks MT_dialog

JamesHiebert
Offline
Last seen: 5 months 1 week ago
Joined: 2014-10-24 14:17
Hi Dialog,

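So this behavior is exhibited randomly on devices that share the same build and FW? Is that correct? [Yes. A small number (few percent) of devices that have the EXACT same configuration of HW and FW exhibit the dropout behavior.]

I mean, are the devices that exhibit this behavior unique, i.e. have you run this test on multiple devices in a similar environment, with only one device (for example) showing this behavior in consecutive tests? [Correct. In one particular environment, there is a lab with a dozen or so identical test stations, all with a DUT with a slave 581 chip, and only one DA14581 exhibited the behavior. We swapped the circuit board with a different board (different 581) and the problem went away. It has also happened in other environments. The behavior follows the 581 (or at least the circuit board on which it is soldered).]

What about the master of the connection? Is this behavior related to a specific master, or does it occur regardless of the master of the link? [The vast majority of the masters are identical custom-built boxes. There are a few outliers: commercial smartphones. I will run an experiment with a known "failing" DA14581 and a smartphone.]

Do you have an actual sniffer trace from when the event occurs? [I am attaching a trace with 2 dropouts. We use a Teledyne LeCroy protocol analysis system. Are you able to read these files?]

...and have you verified that this side effect is due to missed events on the 581's side, so that on the sniffer log you can see packets from the master and no packets from the slave side? [That is also correct. The master continues to send idle packets, with no response from the DA14581. These gaps occur somewhat randomly, but always last about 1.1 - 1.2 seconds.]

You have attached a graph of the throughput, but that doesn't help in finding a root cause. What would help is a current capture at the time of the issue, to actually see through the power consumption that the 581 misses connection events. [This will be tricky with the configuration of our DUT...]

It would also help to mention which SDK you are using and the configuration of the device: is the application running entirely on the 581, or are you using the HCI interface? I assume the application runs entirely on the 581. [The FW running on the 581 is based on the 5.0.4 SDK, but it has been heavily modified. However, most of the arch*.c files are untouched. HCI is not used. In our system there is an app running on the 581 as well as another app running on a different IC. The two chips communicate via SPI with a custom protocol, not HCI.]

What about your low power clock, are you using RCX or XTAL? [RCX]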

JamesHiebert
Offline
Last seen: 5 months 1 week ago
Joined: 2014-10-24 14:17
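Some additional information. The pauses seem to go away with longer connection intervals. For example, with a 47.5 ms CI, no more than 1 slave idle packet in a row was dropped over more than 7 hours. I am running further tests at shorter intervals.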

My observations in the previous message had a CI of 11.25ms programmed.

MT_dialog
Offline
Last seen: 1 month 1 week ago
Staff
Joined:2015-06-08 11:34
Hi JamesHiebert,

The fact that you are able to re-establish communication after more than 1 sec of inactivity of the slave means that you should be OK on the clocks side and that there is no reset while the issue occurs. I don't see anything peculiar from the master side (a previous transaction that might have gone wrong and affected the slave, for example). The only things I can think of that might result in this condition are:

  • Either the SW is executing some code with interrupts disabled and/or does not call the scheduler in a timely manner (at least once between events). Given that the incident doesn't occur when the intervals are increased, I would say this is the most probable cause.
  • Or the wake-up sequence takes longer than expected (clock settling, or code has been added to the wake-up sequence) and the device reaches the programming of the event too late (there is an assertion in the SLP handler in the power_up() function; does that assertion still exist in your code?)

As a result the device misses events. A current trace would help us identify that: if the device is busy running code we should see a flat line of constant consumption, and if the startup FSM took extra time we should see wake-ups and no radio activity at all (since if the device had woken up too late to service the event it won't start a radio transaction).

What would also help is if you try to run a simple example, for example the proximity reporter from the SDK, and check whether you can replicate this on your HW. Also, does the device have any activity while the incident occurs? You mention that it's connected to an external device and communicates via SPI, so is there any action on the bus during that period, or perhaps a pattern that can relate the SPI transactions to the issue? Perhaps there is some kind of correlation between the issue and the communication between the two devices.

Thanks MT_dialog

JamesHiebert
Offline
Last seen: 5 months 1 week ago
Joined: 2014-10-24 14:17
Hi Dialog,

Sorry for the long delay. We've been collecting some data on the issue which may help you help us.

Here's a summary of what we know:
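• The pauses only occur on a subset (5%) of devices. All devices are running the same application SW.
• When the pauses occur:
○ They occur randomly.
○ The duration of the pause depends on the master's SCA (see Dialog Drop-outs by CI.jpg in the attached zip).
○ The pauses only occur if the Connection Interval is a multiple of 11.25ms (22.5ms, 45ms,...). If other CIs are negotiated, there are no drop-outs (see Dialog Drop-outs by CI.jpg in the attached zip).
○ The current trace shows consistent timing whether or not there is a pause, but during a pause there are no current spikes (which indicate Rx and Tx activity) (see stameeause_nopause.png in the attached zip). The upper trace shows a CE without a pause (the two tall spikes represent the idle Tx and Rx activity), and the lower trace shows a CE during a pause. The timing looks consistent.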
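
To make the interval arithmetic above concrete: BLE expresses the connection interval in units of 1.25 ms, so 11.25 ms is 9 units and "a multiple of 11.25 ms" is the same as "a multiple of 9 units". The following is only a minimal sketch in C; the helper names are hypothetical and are not part of the Dialog SDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* BLE expresses the connection interval (CI) in units of 1.25 ms. */
#define CI_UNIT_US 1250u

/* Convert a CI from 1.25 ms units to microseconds. */
static inline uint32_t ci_units_to_us(uint16_t ci_units)
{
    return (uint32_t)ci_units * CI_UNIT_US;
}

/* 11.25 ms is 9 units, so a multiple of 11.25 ms is a multiple of 9 units. */
static inline bool ci_is_multiple_of_11_25_ms(uint16_t ci_units)
{
    return ci_units != 0u && (ci_units % 9u) == 0u;
}

/* Number of whole connection events that fit in a pause of pause_us. */
static inline uint32_t events_in_pause(uint32_t pause_us, uint16_t ci_units)
{
    assert(ci_units != 0u); /* a zero interval is not a valid CI */
    return pause_us / ci_units_to_us(ci_units);
}
```

With these helpers, 45 ms is 36 units (a multiple of 9), while 47.5 ms is 38 units (not a multiple). At a CI of 11.25 ms, a 1.1 s pause spans 97 whole connection events.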

It could be something our application SW is doing (or not doing), but it isn't obvious to us. Thanks for any help you can provide!

Attachment:
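JamesHiebert
Offline
Last seen: 5 months 1 week ago
Joined: 2014-10-24 14:17
"The wake up sequence takes more than expected (clock settling or code has been added in wake-up sequence) and the device reaches the programming of the event too late (there is an assertion in the SLP handler in the power_up() function, is that assertion still exists on your code ?)"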

Regarding your question above about the assertion, is this the code in question? I modified the ASSERT_WARNING to ASSERT_ERROR to cause a very visible reset should this condition occur.

When pauses are occurring, the ASSERT_ERROR below does NOT occur.

/*
 * Check if BLE_SLP_IRQ has already asserted. In this case, we delayed in periph_init().
 * Increase the LP_ISR_TIME_XTAL32_CYCLES and LP_ISR_TIME_USEC values to provide more
 * execution time to periph_init().
 */
if (GetBits32(BLE_INTSTAT_REG, SLPINTSTAT))
    ASSERT_ERROR(0);

MT_dialog
Offline
Last seen: 1 month 1 week ago
Staff
Joined:2015-06-08 11:34
Hi JamesHiebert,

From the traces that you have attached it seems that the device wakes up properly, but apparently the time for the Rx/Tx event is wrong, most probably delayed; this is the only explanation that I can come up with for not having RF activity on certain wake-ups. Although if this time was wrong then the assertion that I've mentioned should hit. Can you also please check whether the second assertion, in lld_sleep_compensate_func_patched(), occurs? Also, is there any activity on the SPI device that communicates with the 581 while the issue occurs? Perhaps what you are experiencing is some kind of race condition, since I am not able to explain why the device would behave like that when using multiples of the 11.25 ms connection interval.

Thanks MT_dialog

JamesHiebert
Offline
Last seen: 5 months 1 week ago
Joined: 2014-10-24 14:17
Hello Dialog,

We are still looking into this issue. Some new learning, perhaps this will help you help us?

With sleep disabled, the pauses do not occur.

When the drop-outs occur, no assertions fire, including the one in lld_sleep_compensate_func_patched().

I modified the existing rwble.c "last event" logging to record the events to a large circular buffer, then read the buffer when a drop out occurred.

When everything is fine, I see BLE_EVT_LP, BLE_EVT_SLP, BLE_EVT_CSCNT, BLE_EVT_RX, BLE_EVT_TX, BLE_EVT_END repeating.

When drop outs occur, I see only BLE_EVT_LP, BLE_EVT_SLP, BLE_EVT_CSCNT repeating.
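
For anyone wanting to reproduce this kind of logging, here is a minimal sketch of a circular event buffer in C. It is not the actual rwble.c change: the type, the function names, the buffer size and the event ID values are all hypothetical stand-ins, and only the event names come from the sequence described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event IDs; the real values live in the SDK's rwble.c. */
enum { EVT_LP, EVT_SLP, EVT_CSCNT, EVT_RX, EVT_TX, EVT_END };

#define EVT_LOG_SIZE 256u                 /* assumed size, power of two */
#define EVT_LOG_MASK (EVT_LOG_SIZE - 1u)
_Static_assert((EVT_LOG_SIZE & EVT_LOG_MASK) == 0u, "size must be a power of two");

struct evt_log {
    uint8_t  entries[EVT_LOG_SIZE];
    uint32_t head;   /* index of the next slot to write */
    uint32_t count;  /* valid entries, saturates at EVT_LOG_SIZE */
};

/* Record one event, overwriting the oldest entry once the buffer is full. */
static void evt_log_put(struct evt_log *log, uint8_t evt_id)
{
    log->entries[log->head] = evt_id;
    log->head = (log->head + 1u) & EVT_LOG_MASK;
    if (log->count < EVT_LOG_SIZE)
        log->count++;
}

/* Copy up to max of the most recent events into out, oldest first. */
static size_t evt_log_read(const struct evt_log *log, uint8_t *out, size_t max)
{
    assert(log != NULL && out != NULL);
    size_t n = log->count < max ? log->count : max;
    uint32_t start = (log->head - (uint32_t)n) & EVT_LOG_MASK;
    for (size_t i = 0; i < n; i++)
        out[i] = log->entries[(start + (uint32_t)i) & EVT_LOG_MASK];
    return n;
}
```

The idea is to call evt_log_put() from each BLE interrupt handler and to dump the buffer with evt_log_read() once a drop-out has been noticed. One byte per event keeps the RAM cost low, which matters when memory is tight.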

I will check for SPI activity, but there shouldn't be any....

JamesHiebert
Offline
Last seen: 5 months 1 week ago
Joined: 2014-10-24 14:17
Also please note that setting USE_POWER_OPTIMIZATIONS to 0 with extended sleep seems to eliminate the pauses/drop-outs. We do not use deep sleep.

MT_dialog
Offline
Last seen: 1 month 1 week ago
Staff
Joined:2015-06-08 11:34
Hi JamesHiebert,

The fact that you see only BLE_EVT_LP, BLE_EVT_SLP, BLE_EVT_CSCNT when the issue occurs confirms that the device wakes up properly but has no BLE activity. As far as I am aware, and as mentioned in previous posts, the only reason I can think of is delays before the RX/TX activities that prevent the device from waking up on time. Since you reported that you don't see any assertions from the debugging structures that the SDK already has, there must be a delay between the BLE_EVT_SLP and the BLE_EVT_CSCNT. From the captures that you have provided I am not able to track the CSCNT event, unless the CSCNT part is executed in the last bump I see in the problematic trace (in which case losing the event is justified). It seems from the trace that the time between the SLP event and the CSCNT is significantly larger than in the proper trace. Right after the SLP, the main while loop of the SDK has the opportunity to run; are you certain that you don't disable the interrupts under some condition in your application code? Can you please take some time measurements following the instructions below:

In the BLE_CSCNT_Handler(), try to log the following values when the issue occurs: evt->time (the time at which the next event should be scheduled) along with the current time count. To do that you can use the snippet below:

For testing purposes, I used two arrays of length 16.

struct lld_evt_tag *evt = (struct lld_evt_tag *)co_list_pick(&lld_evt_env.evt_prog);
time1_log[time_idx] = evt->time;            // time the next event is scheduled for
time2_log[time_idx] = lld_evt_time_get();   // current time count
time_idx++;
time_idx &= 0xF;                            // wrap around after 16 entries

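Normally, the value collected in time1_log should be at least 2 slots in the future relative to the current time value (lld_evt_time_get(), stored in time2_log). If it is not, it means that something (most probably at application level) is stalling the programming of the BLE event, and obviously, since the programming is delayed, the BLE event will not occur.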
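
The check described above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: it assumes the time values are slot counts that wrap at 27 bits, and all names in it (slots_ahead, programmed_in_time, count_late, MIN_LEAD_SLOTS) are hypothetical, not SDK functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: time values are slot counts that wrap at 27 bits. */
#define SLOT_CNT_BITS  27u
#define SLOT_CNT_MASK  ((1u << SLOT_CNT_BITS) - 1u)
#define MIN_LEAD_SLOTS 2   /* required lead, per the explanation above */

/* Signed distance in slots from now to evt_time; negative means "already past". */
static int32_t slots_ahead(uint32_t evt_time, uint32_t now)
{
    uint32_t diff = (evt_time - now) & SLOT_CNT_MASK;
    if (diff > (SLOT_CNT_MASK >> 1))
        return (int32_t)diff - (int32_t)(SLOT_CNT_MASK + 1u);
    return (int32_t)diff;
}

/* True if the event was programmed at least MIN_LEAD_SLOTS ahead of time. */
static bool programmed_in_time(uint32_t evt_time, uint32_t now)
{
    return slots_ahead(evt_time, now) >= MIN_LEAD_SLOTS;
}

/* Count the logged samples in which the event was programmed too late. */
static unsigned count_late(const uint32_t *time1, const uint32_t *time2, unsigned n)
{
    unsigned late = 0;
    for (unsigned i = 0; i < n; i++)
        if (!programmed_in_time(time1[i], time2[i]))
            late++;
    return late;
}
```

Running count_late() over the time1_log and time2_log arrays after a drop-out would show how many of the logged samples were programmed with less than 2 slots of lead. The modular subtraction keeps the comparison valid across a counter wrap.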

Thanks MT_dialog

JamesHiebert
Offline
Last seen: 5 months 1 week ago
Joined: 2014-10-24 14:17
Hello Dialog,

Sorry for the slow responses. I am working on collecting the data you've requested, but it's a bit of a challenge because of the lag between the actual pause and when I notice it on the sniffer. I need sufficient buffer space to record the timing data back that far and memory is running low! I will continue to work on this.

In the meantime, I have discovered the following: if I disable SPI during the connection, the drop-outs stop.

As you know, every time BLE wakes up from extended sleep the peripherals must be reinitialized. In our system there is an app running on the 581 as well as another app running on a different IC. The two chips communicate via SPI with a custom protocol. I put logic in our 581 app to stop re-enabling SPI after extended sleeping. Specifically when the following line of code is no longer run, I see that the pauses immediately stop.

SetBits16(SPI_CTRL_REG, SPI_ON, 1);

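I also tried masking the SPI interrupt and stopping the other GPIO interrupts between the BLE chip and the other chip in our system (see below), but in those cases the pauses continued. Skipping the line above appears to be the only peripheral re-initialization step that stops the pauses.
• Not related to enabling the SPI interrupt
• Not related to signaling the other chip via GPIO.
• Not related to the SPI internal clock divider setting (CLK_PER_REG changed from /2 to /4)
• Not related to SPI interrupt priority (increased from 20 to 3)
• Not related to CLK_PER_REG enable
• Not related to configuring the GPIO pins (input/output, function, etc.) for SPI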

JamesHiebert
Offline
Last seen: 5 months 1 week ago
Joined: 2014-10-24 14:17
Hello Dialog,

Please find attached the timing data collected as requested in the Thu, 2018-07-12 10:36 post.

MHv_Dialog
Offline
Last seen: 1 month 6 days ago
Staff
Joined:2013-12-06 15:10
Hi Jim,

I will be dealing with you offline on this issue. For others following this issue, we will make sure to log our results.

/ mhv.