legOS : 3026


Robotics / RCX / legOS / 3026	3025 \| 3027

Subject:	sys_time (long post)
Newsgroups:	lugnet.robotics.rcx.legos
Date:	Thu, 12 Dec 2002 18:39:11 GMT
Viewed:	3720 times

As I was poking around in the BrickOS kernel, it occurred to me that using the sys_time variable isn't entirely safe. Since sys_time is 32-bits and the processor can only read 16-bits at a time into a register, there is a period of time between the instructions that read the upper and lower halves of sys_time where the interrupt that increments sys_time could fire and cause an inconsistent reading of sys_time to occur. The opportunity for the error to occur only happens every 65 seconds or so when the lower 16-bits of sys_time overflow (i.e. every 65536 milliseconds). And, even then it'll only manifest itself if the interrupt fires between the two instructions that read the upper and lower 16-bits of the sys_time variable. So, the chances of this occurring are exceedingly remote, but nonetheless possible. The bug is subtle, but if you were able to watch sys_time closely in my example program (below) you would see it increment as follows: 0x0014FFFE 0x0014FFFF 0x00140000 0x00150000 0x00150001 Normally, sys_time would be safe to use by temporarily disabling interrupts like so: disable_irqs(); end_time = sys_time + TIMEOUT; enable_irqs(); And, using sys_time in an ISR would not be a problem since interrupts are already disabled at that time. However, the kicker is that in the current version of BrickOS, the sys_time variable is updated by the watchdog timer on an NMI so the above solution won't work. It makes using sys_time in a normal ISR unsafe, as well. Possible solutions include: 1) Move the sys_time increment back to a normal IRQ and use the above code snippet when IRQs happen to be enabled. 2) Use a function that is immune to the problem to read sys_time (even when using sys_time in an ISR). Here is an example (though it's kind of a hack): extern time_t get_sys_time(); __asm__(" .text .align 1 .global _get_sys_time _get_sys_time: mov.w @_sys_time+2,r1 mov.w @_sys_time,r0 mov.w @_sys_time+2,r2 cmp r2,r1 bne _get_sys_time rts "); BTW, I did test to see if this problem can actually happen. The following is a test program that will demonstrate the problem. You may need to run it 10-20 minutes (or more) for it to fail. #include <conio.h> #include <unistd.h> #include <time.h> int main(int argc, char **argv) { time_t a, b; cputs("test"); do { a = sys_time; b = sys_time; } while (a <= b); cputs("fail"); // so we don't power down (use AC power!) while (1); return 0; } The assembly code for the above is as follows. You can see the places where it would be bad for an NMI to occur. .file "sys_time_fail.c" .section .rodata .LC0: .ascii "test\0" .LC1: .ascii "fail\0" .section .text .align 1 .global _main _main: push r4 mov.w #.LC0,r0 jsr @_cputs .L23: mov.w @_sys_time,r0 ; please no NMI here! mov.w @_sys_time+2,r1 mov.w @_sys_time,r2 ; please no NMI here! mov.w @_sys_time+2,r3 mov.w r2,r4 cmp.w r4,r0 bhi .L21 bne .L23 mov.w r3,r2 cmp.w r2,r1 bls .L23 .L21: mov.w #.LC1,r0 jsr @_cputs .L27: bra .L27 .end Thanks for reading if you made it this far! Cheers, Mark

Message has 1 Reply:

		Re: sys_time (long post)
Mark, Excellent Post ! Ok, I will admit it; I was the one that put the sys_time handler on the NMI. 8-) I guess I will want to think about this a bit; but I see that you have done a good bit of that yourself. ... Could the 'Get' function do the (...) (23 years ago, 12-Dec-02, to lugnet.robotics.rcx.legos)

6 Messages in This Thread:

Entire Thread on One Page:: Nested: All | Brief | Compact | Dots
Linear: All | Brief | Compact

Custom Search