To LUGNET HomepageTo LUGNET News HomepageTo LUGNET Guide Homepage
 Help on Searching
 
Post new message to lugnet.robotics.rcx.legosOpen lugnet.robotics.rcx.legos in your NNTP NewsreaderTo LUGNET News Traffic PageSign In (Members)
 Robotics / RCX / legOS / 3026
3025  |  3027
Subject: 
sys_time (long post)
Newsgroups: 
lugnet.robotics.rcx.legos
Date: 
Thu, 12 Dec 2002 18:39:11 GMT
Viewed: 
2722 times
  
As I was poking around in the BrickOS kernel, it
occurred to me that using the sys_time variable isn't
entirely safe.  Since sys_time is 32-bits and the
processor can only read 16-bits at a time into a
register, there is a period of time between the
instructions that read the upper and lower halves
of sys_time where the interrupt that increments
sys_time could fire and cause an inconsistent reading
of sys_time to occur.

The opportunity for the error to occur only happens
every 65 seconds or so when the lower 16-bits of
sys_time overflow (i.e. every 65536 milliseconds).
And, even then it'll only manifest itself if the interrupt
fires between the two instructions that read the upper
and lower 16-bits of the sys_time variable.  So, the
chances of this occurring are exceedingly remote, but
nonetheless possible.

The bug is subtle, but if you were able to watch sys_time
closely in my example program (below) you would see it
increment as follows:

0x0014FFFE
0x0014FFFF
0x00140000
0x00150000
0x00150001

Normally, sys_time would be safe to use by temporarily
disabling interrupts like so:

  disable_irqs();
  end_time = sys_time + TIMEOUT;
  enable_irqs();

And, using sys_time in an ISR would not be a problem
since interrupts are already disabled at that time.

However, the kicker is that in the current version of
BrickOS, the sys_time variable is updated by the
watchdog timer on an NMI so the above solution won't
work.  It makes using sys_time in a normal ISR unsafe,
as well.

Possible solutions include:

1) Move the sys_time increment back to a normal
IRQ and use the above code snippet when IRQs
happen to be enabled.

2) Use a function that is immune to the problem to
read sys_time (even when using sys_time in an ISR).
Here is an example (though it's kind of a hack):

extern time_t get_sys_time();
__asm__("
.text
.align 1
.global _get_sys_time
_get_sys_time:
  mov.w @_sys_time+2,r1
  mov.w @_sys_time,r0
  mov.w @_sys_time+2,r2
  cmp r2,r1
  bne _get_sys_time
  rts
");

BTW, I did test to see if this problem can actually
happen.  The following is a test program that
will demonstrate the problem.  You may need to run
it 10-20 minutes (or more) for it to fail.

#include <conio.h>
#include <unistd.h>
#include <time.h>

int main(int argc, char **argv)
  {
  time_t a, b;

  cputs("test");

  do
    {
    a = sys_time;
    b = sys_time;
    }
  while (a <= b);

  cputs("fail");

  // so we don't power down (use AC power!)
  while (1);

  return 0;
  }

The assembly code for the above is as follows.
You can see the places where it would be bad
for an NMI to occur.

.file "sys_time_fail.c"
.section .rodata
.LC0:
.ascii "test\0"
.LC1:
.ascii "fail\0"
.section .text
.align 1
.global _main
_main:
push r4
mov.w #.LC0,r0
jsr @_cputs
.L23:
mov.w @_sys_time,r0
; please no NMI here!
mov.w @_sys_time+2,r1
mov.w @_sys_time,r2
; please no NMI here!
mov.w @_sys_time+2,r3
mov.w r2,r4
cmp.w r4,r0
bhi .L21
bne .L23
mov.w r3,r2
cmp.w r2,r1
bls .L23
.L21:
mov.w #.LC1,r0
jsr @_cputs
.L27:
bra .L27
.end

Thanks for reading if you made it this far!

Cheers,

Mark



Message has 1 Reply:
  Re: sys_time (long post)
 
Mark, Excellent Post ! Ok, I will admit it; I was the one that put the sys_time handler on the NMI. 8-) I guess I will want to think about this a bit; but I see that you have done a good bit of that yourself. ... Could the 'Get' function do the (...) (22 years ago, 12-Dec-02, to lugnet.robotics.rcx.legos)

6 Messages in This Thread:


Entire Thread on One Page:
Nested:  All | Brief | Compact | Dots
Linear:  All | Brief | Compact
    

Custom Search

©2005 LUGNET. All rights reserved. - hosted by steinbruch.info GbR